Preface


This “book” grew out of a series of twenty-five lecture notes for a sophomore
linear algebra class taught at the University of California, Davis. The audi- 
ence was primarily engineering students and students of pure sciences, some 
of whom may go on to major in mathematics. It was motivated by the lack of 
a book that taught students basic structures of linear algebra without overdo- 
ing mathematical rigor or becoming a mindless exercise in crunching recipes 
at the cost of fundamental understanding. In particular we wanted a book 
that was suitable for all students, not just math majors, that focussed on 
concepts and developing the ability to think in terms of abstract structures 
in order to address the dizzying array of seemingly disparate applications 
that can all actually be addressed with linear algebra methods. 

In addition we had practical concerns. We wanted to offer students an
online version of the book for free, both because we felt it our academic
duty to do so and because we could seamlessly link an online book to
a myriad of other resources, in particular WeBWork exercises and videos.
We also wanted to make the LaTeX source available to other instructors 
so they could easily customize the material to fit their own needs. Finally, 
we wanted to restructure the way the course was taught, by getting the 
students to direct most of their effort at more difficult problems where they 
had to think through concepts, present well-thought-out logical arguments
and learn to turn word problems into ones where the usual array of linear 
algebra recipes could take over. 


How to Use the Book 


At the end of each chapter there is a set of review questions. Our students 
found these very difficult, mostly because they did not know where to begin, 
rather than needing a clever trick. We designed them this way to ensure that 
students grappled with basic concepts. Our main aim was for students to 
master these problems, so that we could ask similar high-caliber problems
on midterm and final examinations. This meant that we did have to direct 
resources to grading some of these problems. For this we used two tricks. 
First we asked students to hand in more problems than we could grade, and 
then secretly selected a subset for grading. Second, because there are more 
review questions than what an individual student could handle, we split the 
class into groups of three or four and assigned the remaining problems to them
for grading. Teamwork is a skill our students will need in the workplace; also
it really enhanced their enjoyment of mathematics.

Learning math is like learning to play a violin—many “technical exercises” 
are necessary before you can really make music! Therefore, each chapter has 
a set of dedicated WeBWork “skills problems” where students can test that 
they have mastered basic linear algebra skills. The beauty of WeBWork is 
that students get instant feedback and problems can be randomized, which 
means that although students are working on the same types of problem, 
they cannot simply tell each other the answer. Instead, we encourage them 
to explain to one another how to do the WeBWork exercises. Our experience 
is that this way, students can mostly figure out how to do the WeBWork 
problems among themselves, freeing up discussion groups and office hours for 
weightier issues. Finally, we really wanted our students to carefully read the 
book. Therefore, each chapter has several very simple WeBWork “reading 
problems”. These appear as links at strategic places. They are very simple 
problems that can be answered rapidly if a student has read the preceding text.


The Material 


We believe the entire book can be taught in twenty-five fifty-minute lectures
to a sophomore audience that has been exposed to a one-year calculus course.
Vector calculus is useful, but not necessary, preparation for this book, which
attempts to be self-contained. Key concepts are presented multiple times, 
throughout the book, often first in a more intuitive setting, and then again 
in a definition, theorem, proof style later on. We do not aim for students 
to become agile mathematical proof writers, but we do expect them to be 
able to show and explain why key results hold. We also often use the review 
exercises to let students discover key results for themselves, before they are
presented again in detail later in the book. 

Linear algebra courses run the risk of becoming a conglomeration of learn- 
by-rote recipes involving arrays filled with numbers. In the modern computer 
era, understanding these recipes, why they work, and what they are for is 
more important than ever. Therefore, we believe it is crucial to change the 
students’ approach to mathematics right from the beginning of the course. 
Instead of them asking us “what do I do here?”, we want them to ask “why 
would I do that?” This means that students need to start to think in terms 
of abstract structures. In particular, they need to rapidly become conversant 
in sets and functions; the first WeBWork set will help them brush up on these
skills.

There is no best order to teach a linear algebra course. The book has 
been written such that instructors can reorder the chapters (using the LaTeX
source) in any (reasonable) order and still have a consistent text. We
hammer the notions of abstract vectors and linear transformations hard and 
early, while at the same time giving students the basic matrix skills neces- 
sary to perform computations. Gaussian elimination is followed directly by 
an “exploration chapter” on the simplex algorithm to open students’ minds
to problems beyond standard linear systems. Vectors in $\mathbb{R}^n$ and general
vector spaces are presented back to back so that students are not stranded 
with the idea that vectors are just ordered lists of numbers. To this end, we 
also labor the notion of all functions from a set to the real numbers. In the 
same vein linear transformations and matrices are presented hand in hand. 
Once students see that a linear map is specified by its action on a limited set 
of inputs, they can already understand what a basis is. All the while students 
are studying linear systems and their solution sets, so after matrices,
determinants are introduced. This material can proceed rapidly since elementary
matrices were already introduced with Gaussian elimination. Only then is a 
careful discussion of spans, linear independence and dimension given to ready 
students for a thorough treatment of eigenvectors and diagonalization. The 
dimension formula therefore appears quite late, since we prefer not to elevate 
rote computations of column and row spaces to a pedestal. The book ends 
with applications—least squares and singular values. These are a fun way to 
end any lecture course. It would also be quite easy to spend any extra time 
on systems of differential equations and simple Fourier transform problems. 


One possible distribution of twenty-five fifty-minute lectures might be:


Chapter
What is Linear Algebra?
Systems of Linear Equations
The Simplex Method
Vectors in Space, n-Vectors
Vector Spaces
Linear Transformations
Matrices
Determinants
Subspaces and Spanning Sets
Linear Independence
Basis and Dimension
Eigenvalues and Eigenvectors
Diagonalization
Orthonormal Bases and Complements
Diagonalizing Symmetric Matrices
Kernel, Range, Nullity, Rank
Least Squares and Singular Values

[The Lectures column of this table is illegible in the source.]


Creating this book has taken the labor of many people. Special thanks are 
due to Katrina Glaeser and Travis Scrimshaw for shooting many of the videos 
and LaTeXing their scripts. Rohit Thomas wrote many of the WeBWork 
problems. Bruno Nachtergaele and Anne Schilling provided inspiration for 
creating a free resource for all students of linear algebra. Dan Comins helped 
with technical aspects. A University of California online pilot grant helped 
fund the graduate students who worked on the project. Most of all we thank 
our students who found many errors in the book and taught us how to teach 
this material! 

Finally, we admit the book’s many shortcomings: clumsy writing, low-quality
artwork and low-tech video material. We welcome anybody who
wishes to contribute new material (WeBWork problems, videos, pictures)
to make this resource a better one and are glad to hear of any typographical
errors, mathematical fallacies, or simply ideas on how to improve the book.


David, Tom and Andrew 
What is Linear Algebra?


Many difficult science problems can be handled using the powerful, yet easy to
use, mathematics of linear algebra. Unfortunately, because the subject (at
least for those learning it) requires seemingly arcane and tedious computations
involving large arrays of numbers known as matrices, the key concepts
and the wide applicability of linear algebra are easily missed. Therefore, before
we equip you with matrix skills, let us give some hints about what linear
algebra is. The takeaway message is


Linear algebra is the study of vectors and linear transformations. 


In broad terms, vectors are things you can add and linear transformations are 
very special functions of vectors that respect vector addition. To understand 
this a little better, let’s try some examples. Please be prepared to change the
way you think about some familiar mathematical objects and keep a pencil 
and piece of paper handy! 


1.1 Vectors? 


Here are some examples of things that can be added: 


Example 1 (Vector Addition) 


(A) Numbers: If $x$ and $y$ are numbers, then so is $x + y$.

(B) 3-vectors: $\begin{pmatrix}1\\1\\0\end{pmatrix} + \begin{pmatrix}0\\1\\1\end{pmatrix} = \begin{pmatrix}1\\2\\1\end{pmatrix}$.


(C) Polynomials: If $p(x) = 1 + x - 2x^2 + 3x^3$ and $q(x) = x + 3x^2 - 3x^3 + x^4$, then
their sum $p(x) + q(x)$ is the new polynomial $1 + 2x + x^2 + x^4$.

(D) Power series: If $f(x) = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots$ and $g(x) = 1 - x + \frac{1}{2!}x^2 - \frac{1}{3!}x^3 + \cdots$,
then $f(x) + g(x) = 2 + \frac{2}{2!}x^2 + \frac{2}{4!}x^4 + \cdots$ is also a power series.

(E) Functions: If $f(x) = e^{x}$ and $g(x) = e^{-x}$, then their sum $f(x) + g(x)$ is the new
function $2\cosh x$.


Because they “can be added”, you should now start thinking of all the above 
objects as vectors! In Chapter 5 we will give the precise rules that vector 
addition must obey. In the above examples, however, notice that the vector 
addition rule stems from the rules for adding numbers. 

When adding the same vector over and over, for example 


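\[ x + x, \quad x + x + x, \quad x + x + x + x, \quad \ldots \]

we will write

\[ 2x, \quad 3x, \quad 4x, \quad \ldots \]

respectively. For example,

\[ 4\begin{pmatrix}1\\1\\0\end{pmatrix} = \begin{pmatrix}1\\1\\0\end{pmatrix} + \begin{pmatrix}1\\1\\0\end{pmatrix} + \begin{pmatrix}1\\1\\0\end{pmatrix} + \begin{pmatrix}1\\1\\0\end{pmatrix} = \begin{pmatrix}4\\4\\0\end{pmatrix}. \]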


Defining $4x = x + x + x + x$ is fine for integer multiples, but does not help us
make sense of $\frac{1}{3}x$. For the different types of vectors above, you can probably
guess how to multiply a vector by a scalar, for example:

\[ \frac{1}{3}\begin{pmatrix}1\\1\\0\end{pmatrix} = \begin{pmatrix}\frac{1}{3}\\ \frac{1}{3}\\ 0\end{pmatrix}. \]

In any given problem that you are planning to describe using vectors, you 
need to decide on a way to add and scalar multiply vectors. In summary: 


Vectors are things you can add and scalar multiply. 
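As a rough illustration (not part of the original notes), the two operations can be sketched in Python for two of the vector types from Example 1. The helper names `add` and `scale`, and the choice to store a polynomial as a list of its coefficients, are assumptions made only for this sketch.

```python
from fractions import Fraction

def add(u, v):
    """Add two vectors stored as equal-length lists, entry by entry."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a + b for a, b in zip(u, v)]

def scale(c, u):
    """Multiply every entry of the vector u by the scalar c."""
    return [c * a for a in u]

# 3-vectors, as in Example 1(B).
print(add([1, 1, 0], [0, 1, 1]))         # [1, 2, 1]

# Polynomials stored as coefficient lists [constant, x, x^2, x^3, x^4].
p = [1, 1, -2, 3, 0]    # 1 + x - 2x^2 + 3x^3
q = [0, 1, 3, -3, 1]    # x + 3x^2 - 3x^3 + x^4
print(add(p, q))                         # [1, 2, 1, 0, 1]

# Scalar multiplication also makes sense for non-integer scalars.
print(scale(Fraction(1, 3), [1, 1, 0]))  # [Fraction(1, 3), Fraction(1, 3), Fraction(0, 1)]
```

The same two functions serve both kinds of object, which is the point of the slogan: once addition and scalar multiplication are fixed, the objects can be treated as vectors.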


1.2 Linear Transformations 


In calculus classes, the main subject of investigation was functions and their 
rates of change. In linear algebra, functions will again be the focus of your
attention, but now functions of a very special type. In calculus, you probably
encountered functions f(x), but were perhaps encouraged to think of this as a
machine “f”, whose input is some real number x. For each input x this
machine outputs a single real number f(x):


[Figure: a machine f that takes an input x and produces the output f(x).]


In linear algebra, the functions we study will take vectors, of some type, 
as both inputs and outputs. We just saw that vectors were objects that could 
be added or scalar multiplied—a very general notion—so the functions we 
are going to study will look novel at first. So things don’t get too abstract, here
are five questions that can be rephrased in terms of functions of vectors: 


Example 2 (Functions of Vectors in Disguise) 
(A) What number $x$ solves $10x = 3$?
(B) What vector $u$ from 3-space satisfies the cross product equation $\begin{pmatrix}1\\1\\0\end{pmatrix} \times u = \begin{pmatrix}0\\1\\1\end{pmatrix}$?
(C) What polynomial $p(x)$ satisfies $\int_{-1}^{1} p(y)\,dy = 0$ and $\int_{-1}^{1} y\,p(y)\,dy = 1$?
(D) What power series $f(x)$ satisfies $x\frac{d}{dx}f(x) - 2f(x) = 0$?
(E) What number $x$ solves $4x^2 = 1$?


For part (A), the machine needed would look like:

\[ x \;\longmapsto\; 10x, \]

which is just like a function $f(x)$ from calculus that takes in a number $x$ and
spits out the number $f(x) = 10x$. For part (B), we need something more
sophisticated along the lines of:

\[ \begin{pmatrix}x\\y\\z\end{pmatrix} \;\longmapsto\; \begin{pmatrix}z\\-z\\y-x\end{pmatrix}, \]


whose inputs and outputs are both 3-vectors. You are probably getting the 
gist by now, but here is the machine needed for part (C): 


\[ p \;\longmapsto\; \begin{pmatrix}\int_{-1}^{1} p(y)\,dy\\[4pt] \int_{-1}^{1} y\,p(y)\,dy\end{pmatrix}. \]


Here we input a polynomial and get a 2-vector as output! 

By now you may be feeling overwhelmed; surely the study of functions as
general as the ones exhibited is very difficult. However, in linear algebra, we
will restrict ourselves to a very important, yet much simpler, class of functions
of vectors than the most general ones. Let’s use the letter L for these
functions and think again about vector addition and scalar multiplication.
Let’s suppose v and u are vectors and c is a number. Then we already know
that u + v and cu are also vectors. Since L is a function of vectors, if we
input u into L, the output L(u) will also be some sort of vector. The same
goes for L(v), L(u + v) and L(cu). Moreover, we can now also think about
adding L(u) and L(v) to get yet another vector L(u) + L(v) or of multiplying
L(u) by c to obtain the vector cL(u). Perhaps a picture of all this helps:


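[Figure: the function L carries a blob of input vectors, including u, v, u + v and cu, to a blob of output vectors, including L(u), L(v), L(u + v) and L(cu).]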


The “blob” on the left represents all the vectors that you are allowed to input 
into the function L, and the blob on the right denotes the corresponding 
outputs. Hopefully you noticed that there are two vectors apparently not 
shown on the blob of outputs: 


L(u) + L(v)   &   cL(u).


You might already be able to guess the values we would like these to take. If
not, here’s the answer; it’s the key equation of the whole class, from which
everything else follows:


1. Additivity:
L(u + v) = L(u) + L(v).


2. Homogeneity:
L(cu) = cL(u).


Most functions of vectors do not obey this requirement; linear algebra is the 
study of those that do. Notice that the additivity requirement says that the 
function L respects vector addition: it does not matter if you first add u 
and v and then input their sum into L, or first input u and v into L sepa- 
rately and then add the outputs. The same holds for scalar multiplication—try 
writing out the scalar multiplication version of the italicized sentence. When 
a function of vectors obeys the additivity and homogeneity properties we say 
that it is linear (this is the “linear” of linear algebra). Together, additivity 
and homogeneity are called linearity. Other, equivalent, names for linear 
functions are: 


[Figure: linear function = linear transformation = linear operator = linear map.]


The questions in cases (A-D) of our example can all be restated as a single
equation:

\[ Lv = w \]


where v is an unknown and w a known vector, and L is a linear transfor- 
mation. To check that this is true, one needs to know the rules for adding 
vectors (both inputs and outputs) and then check linearity of L. Solving the 
equation Lv = w often amounts to solving systems of linear equations, the 
skill you will learn in Chapter 2. 
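The two linearity conditions can also be probed numerically. The sketch below is an illustration rather than part of the original notes: it tests additivity and homogeneity on random integer 3-vectors for the map u -> (1, 1, 0) x u from Example 2(B), and for an entrywise squaring map chosen here as a non-linear contrast. The function names are hypothetical, and a passing test is only evidence of linearity, not a proof.

```python
import random

def cross_with_110(u):
    """The map u -> (1, 1, 0) x u for a 3-vector u = (x, y, z)."""
    x, y, z = u
    return (z, -z, y - x)

def square_each(u):
    """A non-linear map for contrast: square every entry."""
    return tuple(a * a for a in u)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    return tuple(c * a for a in u)

def looks_linear(L, trials=100, seed=0):
    """Check additivity and homogeneity of L on random integer 3-vectors.

    Returning True is only evidence of linearity; returning False disproves it.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        u = tuple(rng.randint(-9, 9) for _ in range(3))
        v = tuple(rng.randint(-9, 9) for _ in range(3))
        c = rng.randint(-9, 9)
        if L(add(u, v)) != add(L(u), L(v)):
            return False   # additivity fails
        if L(scale(c, u)) != scale(c, L(u)):
            return False   # homogeneity fails
    return True

print(looks_linear(cross_with_110))  # True
print(looks_linear(square_each))     # False
```

Integer inputs are used so that the equality checks are exact and not affected by floating-point rounding.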

A great example is the derivative operator: 




Example 3 (The derivative operator is linear) 
For any two functions f(x), g(x) and any number c, in calculus you probably learnt 
that the derivative operator satisfies 


1. $\frac{d}{dx}(cf) = c\,\frac{d}{dx}f$,
2. $\frac{d}{dx}(f + g) = \frac{d}{dx}f + \frac{d}{dx}g$.


If we view functions as vectors with addition given by addition of functions and scalar 
multiplication just multiplication of functions by a constant, then these familiar prop- 
erties of derivatives are just the linearity property of linear maps. 
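For polynomials, these two properties can be checked directly. In the following sketch a polynomial is stored as a list of coefficients [a0, a1, a2, ...]; this representation and the helper names are assumptions made for illustration, and polynomials stand in here for the more general functions of the example.

```python
def derivative(p):
    """Differentiate a polynomial given as coefficients [a0, a1, a2, ...]."""
    return [k * a for k, a in enumerate(p)][1:] or [0]

def add(p, q):
    """Add two polynomials, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    """Multiply a polynomial by the constant c."""
    return [c * a for a in p]

f = [1, 1, -2, 3]      # 1 + x - 2x^2 + 3x^3
g = [0, 1, 3, -3, 1]   # x + 3x^2 - 3x^3 + x^4
c = 5

# Property 1: d/dx (c f) = c d/dx f
print(derivative(scale(c, f)) == scale(c, derivative(f)))          # True
# Property 2: d/dx (f + g) = d/dx f + d/dx g
print(derivative(add(f, g)) == add(derivative(f), derivative(g)))  # True
```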


Before introducing matrices, notice that for linear maps L we will often 
write simply Lu instead of L(u). This is because the linearity property of a 
linear transformation L means that L(u) can be thought of as multiplying 
the vector u by the linear operator L. For example, the linearity of L implies 
that if u,v are vectors and c,d are numbers, then 


L(cu + dv) = cLu + dLv,


which feels a lot like the regular rules of algebra for numbers. Notice though, 
that “uL” makes no sense here. 


Remark A sum of multiples of vectors cu + dv is called a linear combination of 
u and v. 


1.3 What is a Matrix? 


Matrices are linear operators of a certain kind. One way to learn about them 
is by studying systems of linear equations. 


Example 4 A room contains x bags and y boxes of fruit:


[Figure: bags and boxes of fruit.]


Each bag contains 2 apples and 4 bananas and each box contains 6 apples and 8 
bananas. There are 20 apples and 28 bananas in the room. Find x and y. 
The values are the numbers x and y that simultaneously make both of the following
equations true:


2x + 6y = 20
4x + 8y = 28.


Here we have an example of a System of Linear Equations. It’s a collection
of equations in which variables are multiplied by constants and summed, and
no variables are multiplied together: There are no powers of variables greater
than one (like $x^2$ or $b^5$), no non-integer or negative powers of variables (like $y^{-1/2}$
or $a^{\pi}$), and no places where variables are multiplied together (like $ab$ or $xy$).


Reading homework: problem 1


Information about the fruity contents of the room can be stored two ways: 
(i) In terms of the number of apples and bananas. 
(ii) In terms of the number of bags and boxes. 


Intuitively, knowing the information in one form allows you to figure out the 
information in the other form. Going from (ii) to (i) is easy: If you knew there 
were 3 bags and 2 boxes it would be easy to calculate the number of apples 
and bananas, and doing so would have the feel of multiplication (containers 
times fruit per container). In the example above we are required to go the 
other direction, from (i) to (ii). This feels like the opposite of multiplication, 
i.e., division. Matrix notation will make clear what we are “dividing” by. 

The goal of Chapter 2 is to efficiently solve systems of linear equations. 
Partly, this is just a matter of finding a better notation, but one that hints 
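at a deeper underlying mathematical structure. For that, we need rules for
adding and scalar multiplying 2-vectors:

\[ c\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}cx\\cy\end{pmatrix} \quad\text{and}\quad \begin{pmatrix}x\\y\end{pmatrix} + \begin{pmatrix}x'\\y'\end{pmatrix} = \begin{pmatrix}x + x'\\y + y'\end{pmatrix}. \]

Writing our fruity equations as an equality between 2-vectors and then using
these rules we have:

\[ \left.\begin{array}{rcl} 2x + 6y &=& 20\\ 4x + 8y &=& 28 \end{array}\right\} \iff \begin{pmatrix}2x + 6y\\4x + 8y\end{pmatrix} = \begin{pmatrix}20\\28\end{pmatrix} \iff x\begin{pmatrix}2\\4\end{pmatrix} + y\begin{pmatrix}6\\8\end{pmatrix} = \begin{pmatrix}20\\28\end{pmatrix}. \]

Now we introduce an operator which takes in 2-vectors and gives out 2-vectors.
We denote it by an array of numbers called a matrix:

The operator $\begin{pmatrix}2&6\\4&8\end{pmatrix}$ is defined by

\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} := x\begin{pmatrix}2\\4\end{pmatrix} + y\begin{pmatrix}6\\8\end{pmatrix}. \]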


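A similar definition applies to matrices with different numbers and sizes:
Example 5 (A bigger matrix)


[Displayed equation illegible in the source: a larger matrix applied to a vector, defined in the same way as a sum of its columns weighted by the entries of the vector.]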


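Viewed as a machine that inputs and outputs 2-vectors, our 2 x 2 matrix
does the following:

\[ \begin{pmatrix}x\\y\end{pmatrix} \;\longmapsto\; \begin{pmatrix}2x + 6y\\4x + 8y\end{pmatrix}. \]


Our fruity problem is now rather concise:

\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}20\\28\end{pmatrix}. \]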


Example 6 (This time in purely mathematical language):

What vector $\begin{pmatrix}x\\y\end{pmatrix}$ satisfies $\begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}20\\28\end{pmatrix}$?


This is of the same Lv = w form as our opening examples. The matrix
encodes fruit per container. The equation is roughly fruit per container
times number of containers equals fruit. To solve for the number of containers
we want to somehow “divide” by the matrix.

Another way to think about the above example is to remember the rule 
for multiplying a matrix times a vector. If you have forgotten this, you can 
actually guess a good rule by making sure the matrix equation is the same 
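as the system of linear equations. This would require that

\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}2x + 6y\\4x + 8y\end{pmatrix}. \]

Indeed this is an example of the general rule that you have probably seen

\[ \begin{pmatrix}p&q\\r&s\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} := \begin{pmatrix}px + qy\\rx + sy\end{pmatrix} = x\begin{pmatrix}p\\r\end{pmatrix} + y\begin{pmatrix}q\\s\end{pmatrix}. \]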


Notice, that the second way of writing the output on the right hand side of 
this equation is very useful because it tells us what all possible outputs a 
matrix times a vector look like — they are just sums of the columns of the 
matrix multiplied by scalars. The set of all possible outputs of a matrix times 
a vector is called the column space (it is also the image of the linear function 
defined by the matrix). 
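The two ways of writing the output can be compared in a short Python sketch. It is an illustration only: the function names `mat_vec` and `column_combination` are made up for this sketch, and matrices are assumed to be stored as lists of rows.

```python
def mat_vec(M, v):
    """Row rule: entry i of the output is the sum over j of M[i][j] * v[j]."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def column_combination(M, v):
    """Column rule: the output is v[0]*(column 0) + v[1]*(column 1) + ..."""
    out = [0] * len(M)
    for j, x in enumerate(v):
        for i in range(len(M)):
            out[i] += x * M[i][j]
    return out

# The fruit matrix: rows are (apples, bananas), columns are (per bag, per box).
M = [[2, 6],
     [4, 8]]

containers = [3, 2]   # 3 bags and 2 boxes
print(mat_vec(M, containers))             # [18, 28]
print(column_combination(M, containers))  # [18, 28]
```

Both rules give 18 apples and 28 bananas for 3 bags and 2 boxes. The column version makes it visible that every output is a sum of multiples of the columns (2, 4) and (6, 8).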


Reading homework: problem 2


A matrix is an example of a Linear Transformation, because it takes one 
vector and turns it into another in a “linear” way. Of course, we can have 
much larger matrices if our system has more variables. 


Matrices in Space!


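Matrices are linear operators. The statement of this for the matrix in our
fruity example looks like

1. $\begin{pmatrix}2&6\\4&8\end{pmatrix}\lambda\begin{pmatrix}x\\y\end{pmatrix} = \lambda\begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}$

2. $\begin{pmatrix}2&6\\4&8\end{pmatrix}\left[\begin{pmatrix}x\\y\end{pmatrix} + \begin{pmatrix}x'\\y'\end{pmatrix}\right] = \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} + \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x'\\y'\end{pmatrix}$

These equalities can already be verified using only the rules we introduced
so far.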


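Example 7 Verify that $\begin{pmatrix}2&6\\4&8\end{pmatrix}$ is a linear operator.

Homogeneity:

\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\left[c\begin{pmatrix}a\\b\end{pmatrix}\right] = \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}ca\\cb\end{pmatrix} = ca\begin{pmatrix}2\\4\end{pmatrix} + cb\begin{pmatrix}6\\8\end{pmatrix} = \begin{pmatrix}2ac\\4ac\end{pmatrix} + \begin{pmatrix}6bc\\8bc\end{pmatrix} = \begin{pmatrix}2ac + 6bc\\4ac + 8bc\end{pmatrix}, \]

which ought (and does) give the same result as

\[ c\begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix} = c\left[a\begin{pmatrix}2\\4\end{pmatrix} + b\begin{pmatrix}6\\8\end{pmatrix}\right] = c\left[\begin{pmatrix}2a\\4a\end{pmatrix} + \begin{pmatrix}6b\\8b\end{pmatrix}\right] = c\begin{pmatrix}2a + 6b\\4a + 8b\end{pmatrix} = \begin{pmatrix}2ac + 6bc\\4ac + 8bc\end{pmatrix}. \]

Additivity:

\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\left[\begin{pmatrix}a\\b\end{pmatrix} + \begin{pmatrix}c\\d\end{pmatrix}\right] = \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}a + c\\b + d\end{pmatrix} = (a + c)\begin{pmatrix}2\\4\end{pmatrix} + (b + d)\begin{pmatrix}6\\8\end{pmatrix} = \begin{pmatrix}2(a + c)\\4(a + c)\end{pmatrix} + \begin{pmatrix}6(b + d)\\8(b + d)\end{pmatrix} = \begin{pmatrix}2a + 2c + 6b + 6d\\4a + 4c + 8b + 8d\end{pmatrix}, \]

which we need to compare to

\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix} + \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}c\\d\end{pmatrix} = a\begin{pmatrix}2\\4\end{pmatrix} + b\begin{pmatrix}6\\8\end{pmatrix} + c\begin{pmatrix}2\\4\end{pmatrix} + d\begin{pmatrix}6\\8\end{pmatrix} = \begin{pmatrix}2a + 2c + 6b + 6d\\4a + 4c + 8b + 8d\end{pmatrix}. \]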

We have come full circle; matrices are just examples of the kinds of linear 
operators that appear in algebra problems like those in section 1.2. Any 
equation of the form Mv = w with M a matrix, and v,w n-vectors is called 
a matrix equation. Chapter 2 is about efficiently solving systems of linear 
equations, or equivalently matrix equations. 


1.4 Review Problems 


You probably have already noticed that understanding sets, functions and
basic logical operations is a must to do well in linear algebra. Brush up on
these skills by trying these background WeBWork problems:


Logic                   1
Sets                    2
Functions               3
Equivalence Relations   4
Proofs                  5

Each chapter also has reading and skills WeBWork problems:

Webwork:   Reading problems   1, 2


Probably you will spend most of your time on the review questions: 
1. Problems A, B, and C of example 2 can all be written as Lv = w where
\[ L : V \longrightarrow W \]


(read this as L maps the set of vectors V to the set of vectors W). For 
each case write down the sets V and W where the vectors v and w 
come from. 


2. Torque is a measure of “rotational force”. It is a vector whose direction 
is the (preferred) axis of rotation. Upon applying a force F on an object 
at point r the torque T is the cross product r x F = T.


Let’s find the force F (a vector) one must apply to a wrench lying along
the vector $r = \begin{pmatrix}1\\1\\0\end{pmatrix}$ ft, to produce a torque $\begin{pmatrix}0\\0\\1\end{pmatrix}$ ft lb:

(a) Find a solution by writing out this equation with $F = \begin{pmatrix}a\\b\\c\end{pmatrix}$.
(Hint: Guess and check that a solution with a = 0 exists).

(b) Add $\begin{pmatrix}1\\1\\0\end{pmatrix}$ to your solution and check that the result is a solution.


(c) Give a physics explanation why there can be two solutions, and 
argue that there are, in fact, infinitely many solutions. 


(d) Set up a system of three linear equations with the three compo- 
nents of F as the variables which describes this situation. What 
happens if you try to solve these equations by substitution? 


3. The function P(t) gives gas prices (in units of dollars per gallon) as a 
function of t the year, and g(t) is the gas consumption rate measured 
in gallons per year by an average driver as a function of their age. 
Assuming a lifetime is 100 years, what function gives the total amount 
spent on gas during the lifetime of an individual born in an arbitrary 
year t? Is the operator that maps g to this function linear? 


4. The differential equation (DE)

\[ \frac{d}{dt}f = 2f \]

says that the rate of change of f is proportional to f. It describes
exponential growth because

\[ f(t) = f(0)e^{2t} \]

satisfies the DE for any number f(0). The number 2 in the DE is called
the constant of proportionality. A similar DE

[Displayed equation missing in the source.]

has a time-dependent “constant of proportionality”.


(a) Do you think that the second DE describes exponential growth? 
(b) Write both DEs in the form Df = 0 with D a linear operator. 


5. Pablo is a nutritionist who knows that oranges always have twice as 
much sugar as apples. When considering the sugar intake of schoolchil- 
dren eating a barrel of fruit, he represents the barrel like so: 
[Figure: Pablo’s representation of the barrel, with axes labeled fruit and sugar.]


Find a linear operator relating Pablo’s representation to the “everyday” 
representation in terms of the number of apples and number of oranges. 
Write your answer as a matrix. 


Hint: Let $\lambda$ represent the amount of sugar in each apple.


Hint


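6. Matrix Multiplication: Let M and N be matrices

\[ M = \begin{pmatrix}a&b\\c&d\end{pmatrix} \quad\text{and}\quad N = \begin{pmatrix}e&f\\g&h\end{pmatrix}, \]

and v the vector

\[ v = \begin{pmatrix}x\\y\end{pmatrix}. \]

If we first apply N and then M to v we obtain the vector MNv.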


(a) Show that the composition of matrices MN is also a linear oper- 
ator. 


(b) Write out the components of the matrix product MN in terms of 
the components of M and the components of N. Hint: use the 
general rule for multiplying a 2-vector by a 2x2 matrix. 
(c) Try to answer the following common question, “Is there any sense
in which these rules for matrix multiplication are unavoidable, or 
are they just a notation that could be replaced by some other 
notation?” 


(d) Generalize your multiplication rule to 3 x 3 matrices. 


7. Diagonal matrices: A matrix M can be thought of as an array of numbers
$m^i_j$, known as matrix entries, or matrix components, where $i$ and $j$
index row and column numbers, respectively. Let

\[ M = \begin{pmatrix}1&2\\3&4\end{pmatrix} = \left(m^i_j\right). \]

Compute $m^1_1$, $m^1_2$, $m^2_1$ and $m^2_2$.


The matrix entries $m^i_i$ whose row and column numbers are the same
are called the diagonal of M. Matrix entries $m^i_j$ with $i \neq j$ are called
off-diagonal. How many diagonal entries does an $n \times n$ matrix have?
How many off-diagonal entries does an $n \times n$ matrix have?


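If all the off-diagonal entries of a matrix vanish, we say that the matrix
is diagonal. Let

\[ D = \begin{pmatrix}\lambda_1&0\\0&\lambda_2\end{pmatrix} \quad\text{and}\quad D' = \begin{pmatrix}\lambda'_1&0\\0&\lambda'_2\end{pmatrix}. \]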

Are these matrices diagonal and why? Use the rule you found in prob- 
lem 6 to compute the matrix products DD’ and D'D. What do you 
observe? Do you think the same property holds for arbitrary matrices? 
What about products where only one of the matrices is diagonal? 


8. Find the linear operator that takes in vectors from n-space and gives 
out vectors from n-space in such a way that whatever you put in, you 
get exactly the same thing out. Show that it is unique. Can you write 
this operator as a matrix? Hint: To show something is unique, it is 
usually best to begin by pretending that it isn’t, and then showing 
that this leads to a nonsensical conclusion. In mathspeak—proof by 
contradiction. 


9. Consider the set $S = \{\ast, \star, \#\}$. It contains just 3 elements, and has
no ordering; $\{\ast, \star, \#\} = \{\#, \star, \ast\}$ etc. (In fact the same is true for
$\{1, 2, 3\} = \{2, 3, 1\}$ etc, although we could make this an ordered set
using 3 > 2 > 1.)


(i) Invent a function with domain $\{\ast, \star, \#\}$ and codomain $\mathbb{R}$. (Remember
that the domain of a function is the set of all its allowed
inputs and the codomain (or target space) is the set where the
outputs can live. A function is specified by assigning exactly one
codomain element to each element of the domain.)


(ii) Choose an ordering on $\{\ast, \star, \#\}$, and then use it to write your
function from part (i) as a triple of numbers.


(iii) Choose a new ordering on $\{\ast, \star, \#\}$ and then write your function
from part (i) as a triple of numbers.


(iv) Your answers for parts (ii) and (iii) are different yet represent the 
same function — explain! 
2.1 Gaussian Elimination


Systems of linear equations can be written as matrix equations. Now you 
will learn an efficient algorithm for (maximally) simplifying a system of linear 
equations (or a matrix equation) — Gaussian elimination. 


2.1.1 Augmented Matrix Notation 


Efficiency demands a new notation, called an augmented matrix, which we 
introduce via examples: 
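The linear system

\[ \begin{cases} x + y = 27\\ 2x - y = 0 \end{cases} \]

is denoted by the augmented matrix

\[ \left(\begin{array}{rr|r} 1 & 1 & 27\\ 2 & -1 & 0 \end{array}\right). \]

This notation is simpler than the matrix one,

\[ \begin{pmatrix}1&1\\2&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}27\\0\end{pmatrix}, \]

although all three of the above equations denote the same thing.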
Augmented Matrix Notation


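Another interesting rewriting is

\[ x\begin{pmatrix}1\\2\end{pmatrix} + y\begin{pmatrix}1\\-1\end{pmatrix} = \begin{pmatrix}27\\0\end{pmatrix}. \]

This tells us that we are trying to find which combination of the vectors
$\begin{pmatrix}1\\2\end{pmatrix}$ and $\begin{pmatrix}1\\-1\end{pmatrix}$ adds up to $\begin{pmatrix}27\\0\end{pmatrix}$; the answer is “clearly” $9\begin{pmatrix}1\\2\end{pmatrix} + 18\begin{pmatrix}1\\-1\end{pmatrix}$.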


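Here is a larger example. The system

\[ \begin{cases} 1x + 3y + 2z + 0w = 9\\ 6x + 2y + 0z - 2w = 0\\ -1x + 0y + 1z + 1w = 3 \end{cases} \]

is denoted by the augmented matrix

\[ \left(\begin{array}{rrrr|r} 1 & 3 & 2 & 0 & 9\\ 6 & 2 & 0 & -2 & 0\\ -1 & 0 & 1 & 1 & 3 \end{array}\right), \]

which is equivalent to the matrix equation

\[ \begin{pmatrix}1&3&2&0\\6&2&0&-2\\-1&0&1&1\end{pmatrix}\begin{pmatrix}x\\y\\z\\w\end{pmatrix} = \begin{pmatrix}9\\0\\3\end{pmatrix}. \]

Again, we are trying to find which combination of the columns of the matrix
adds up to the vector on the right hand side.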

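For the general case of r linear equations in k unknowns, the number
of equations is the number of rows r in the augmented matrix, and the
number of columns k in the matrix left of the vertical line is the number of
unknowns:

\[ \left(\begin{array}{cccc|c} a^1_1 & a^1_2 & \cdots & a^1_k & b^1\\ a^2_1 & a^2_2 & \cdots & a^2_k & b^2\\ \vdots & \vdots & & \vdots & \vdots\\ a^r_1 & a^r_2 & \cdots & a^r_k & b^r \end{array}\right) \]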
Entries left of the divide carry two indices; subscripts denote column number
and superscripts row number. We emphasize, the superscripts here do not
denote exponents. Make sure you can write out the system of equations and
the associated matrix equation for any augmented matrix.
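One way to practice this translation is to let a small program write out the equations. The following Python sketch is only an illustration; the function name `equations_from_augmented` and the convention that the last entry of each row is the right-hand side are assumptions of the sketch.

```python
def equations_from_augmented(aug, names):
    """Return one equation string per row of an augmented matrix.

    aug   -- list of rows; the last entry of each row is the right-hand side
    names -- variable names, one per column left of the dividing line
    """
    lines = []
    for row in aug:
        *coeffs, rhs = row
        if len(coeffs) != len(names):
            raise ValueError("need exactly one variable name per coefficient column")
        terms = " + ".join(f"({a}){x}" for a, x in zip(coeffs, names))
        lines.append(f"{terms} = {rhs}")
    return lines

aug = [[1, 1, 27],
       [2, -1, 0]]
for line in equations_from_augmented(aug, ["x", "y"]):
    print(line)
# (1)x + (1)y = 27
# (2)x + (-1)y = 0
```

Each row of the augmented matrix becomes one equation, and each column left of the divide supplies the coefficients of one unknown.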


Reading homework: problem 1


We now have three ways of writing the same question. Let’s put them 
side by side as we solve the system by strategically adding and subtracting 
equations. 


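Example 8 (How matrix equations and augmented matrices change in elimination)

\[ \left.\begin{array}{rcl} x + y &=& 27\\ 2x - y &=& 0 \end{array}\right\} \iff \begin{pmatrix}1&1\\2&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}27\\0\end{pmatrix} \iff \left(\begin{array}{rr|r} 1 & 1 & 27\\ 2 & -1 & 0 \end{array}\right). \]

Replace the first equation by the sum of the two equations:

\[ \left.\begin{array}{rcl} 3x + 0 &=& 27\\ 2x - y &=& 0 \end{array}\right\} \iff \begin{pmatrix}3&0\\2&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}27\\0\end{pmatrix} \iff \left(\begin{array}{rr|r} 3 & 0 & 27\\ 2 & -1 & 0 \end{array}\right). \]

Let the new first equation be the old first equation divided by 3:

\[ \left.\begin{array}{rcl} x + 0 &=& 9\\ 2x - y &=& 0 \end{array}\right\} \iff \begin{pmatrix}1&0\\2&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}9\\0\end{pmatrix} \iff \left(\begin{array}{rr|r} 1 & 0 & 9\\ 2 & -1 & 0 \end{array}\right). \]

Replace the second equation by the second equation minus two times the first equation:

\[ \left.\begin{array}{rcl} x + 0 &=& 9\\ 0 - y &=& -18 \end{array}\right\} \iff \begin{pmatrix}1&0\\0&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}9\\-18\end{pmatrix} \iff \left(\begin{array}{rr|r} 1 & 0 & 9\\ 0 & -1 & -18 \end{array}\right). \]

Let the new second equation be the old second equation divided by -1:

\[ \left.\begin{array}{rcl} x + 0 &=& 9\\ 0 + y &=& 18 \end{array}\right\} \iff \begin{pmatrix}1&0\\0&1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}9\\18\end{pmatrix} \iff \left(\begin{array}{rr|r} 1 & 0 & 9\\ 0 & 1 & 18 \end{array}\right). \]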

Did you see what the strategy was? To eliminate y from the first equation 
and then eliminate x from the second. The result was the solution to the 
system. 

Here is the big idea: Everywhere in the instructions above we can replace 
the word “equation” with the word “row” and interpret them as telling us 
what to do with the augmented matrix instead of the system of equations. 
Performed systematically, the result is the Gaussian elimination algorithm.
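To see the instructions acting on rows, here is a minimal Python sketch that replays the four steps on the augmented matrix for x + y = 27, 2x - y = 0. The helper names `scale_row` and `add_multiple` are hypothetical, and rows are numbered from 0 in the code but from 1 in the comments.

```python
from fractions import Fraction

def scale_row(A, i, c):
    """Replace row i by c times row i."""
    A[i] = [c * a for a in A[i]]

def add_multiple(A, i, j, c):
    """Replace row i by (row i) + c * (row j)."""
    A[i] = [a + c * b for a, b in zip(A[i], A[j])]

# Augmented matrix for x + y = 27, 2x - y = 0; Fractions keep the arithmetic exact.
A = [[Fraction(1), Fraction(1), Fraction(27)],
     [Fraction(2), Fraction(-1), Fraction(0)]]

add_multiple(A, 0, 1, 1)          # first row  <- first row + second row
scale_row(A, 0, Fraction(1, 3))   # first row  <- first row divided by 3
add_multiple(A, 1, 0, -2)         # second row <- second row - 2 * first row
scale_row(A, 1, -1)               # second row <- second row divided by -1

print([[int(a) for a in row] for row in A])  # [[1, 0, 9], [0, 1, 18]]
```

The final rows read x = 9 and y = 18, matching the solution found with equations.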


2.1.2 Equivalence and the Act of Solving


We introduce the symbol ~ which is called “tilde” but should be read as “is 
(row) equivalent to” because at each step the augmented matrix changes by 
an operation on its rows but its solutions do not. For example, we found 


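above that
\[
\left(\begin{array}{rr|r} 1 & 1 & 27\\ 2 & -1 & 0 \end{array}\right)
\sim
\left(\begin{array}{rr|r} 1 & 0 & 9\\ 2 & -1 & 0 \end{array}\right)
\sim
\left(\begin{array}{rr|r} 1 & 0 & 9\\ 0 & 1 & 18 \end{array}\right).
\]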


The last of these augmented matrices is our favorite! 


An Equivalence Example 


Setting up a string of equivalences like this is a means of solving a system 
of linear equations. This is the main idea of section 2.1.3. This next example 
hints at the main trick: 


Example 9 (Using Gaussian elimination to solve a system of linear equations) 


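\[
\begin{array}{rcr} x + y &=& 5\\ x + 2y &=& 8 \end{array}
\;\Leftrightarrow\;
\left(\begin{array}{rr|r} 1 & 1 & 5\\ 1 & 2 & 8 \end{array}\right)
\sim
\left(\begin{array}{rr|r} 1 & 1 & 5\\ 0 & 1 & 3 \end{array}\right)
\sim
\left(\begin{array}{rr|r} 1 & 0 & 2\\ 0 & 1 & 3 \end{array}\right)
\;\Leftrightarrow\;
\begin{array}{rcr} x + 0 &=& 2\\ 0 + y &=& 3 \end{array}
\]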
Note that in going from the first to second augmented matrix, we used the top left 1 
to make the bottom left entry zero. For this reason we call the top left entry a pivot. 
Similarly, to get from the second to third augmented matrix, the bottom right entry 


(before the divide) was used to make the top right one vanish; so the bottom right 
entry is also called a pivot. 


This name pivot is used to indicate the matrix entry used to “zero out” 
the other entries in its column. 


2.1.3 Reduced Row Echelon Form 


For a system of two linear equations, the goal of Gaussian elimination is to 
convert the part of the augmented matrix left of the dividing line into the 
matrix 
\[
I = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix},
\]
called the Identity Matrix, since this would give the simple statement of a 
solution x = a, y = b. The same goes for larger systems of equations for 
which the identity matrix I has 1's along its diagonal and all off-diagonal 
entries vanish: 
\[
I = \begin{pmatrix}
1 & 0 & \cdots & 0\\
0 & 1 & & 0\\
\vdots & & \ddots & \vdots\\
0 & 0 & \cdots & 1
\end{pmatrix}
\]


Reading homework: problem 2 


For many systems, it is not possible to reach the identity in the augmented 
matrix via Gaussian elimination: 


Example 10 (Redundant equations) 

\[
\begin{array}{rcr} x + y &=& 2\\ 2x + 2y &=& 4 \end{array}
\;\Leftrightarrow\;
\left(\begin{array}{rr|r} 1 & 1 & 2\\ 2 & 2 & 4 \end{array}\right)
\sim
\left(\begin{array}{rr|r} 1 & 1 & 2\\ 0 & 0 & 0 \end{array}\right)
\;\Leftrightarrow\;
\begin{array}{rcr} x + y &=& 2\\ 0 + 0 &=& 0 \end{array}
\]

This example demonstrates that if one equation is a multiple of the other, the identity matrix 
cannot be reached. This is because the first step in elimination will make the second 
row a row of zeros. Notice that solutions still exist; x = 1, y = 1 is a solution. The 
last augmented matrix here is in RREF. 


Example 11 (Inconsistent equations) 


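\[
\begin{array}{rcr} x + y &=& 2\\ 2x + 2y &=& 5 \end{array}
\;\Leftrightarrow\;
\left(\begin{array}{rr|r} 1 & 1 & 2\\ 2 & 2 & 5 \end{array}\right)
\sim
\left(\begin{array}{rr|r} 1 & 1 & 2\\ 0 & 0 & 1 \end{array}\right)
\;\Leftrightarrow\;
\begin{array}{rcr} x + y &=& 2\\ 0 + 0 &=& 1 \end{array}
\]

This system of equations has a solution if there exist two numbers x and y such that 
0 + 0 = 1. That is a tricky way of saying there are no solutions. The last form of the 
augmented matrix here is in RREF. 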


Example 12 (Silly order of equations) 
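A robot might make this mistake: 

\[
\begin{array}{rcr} 0x + y &=& -2\\ x + y &=& 7 \end{array}
\;\Leftrightarrow\;
\left(\begin{array}{rr|r} 0 & 1 & -2\\ 1 & 1 & 7 \end{array}\right)
\]

and then give up because the upper left slot cannot function as a pivot, since the 0 
that lives there cannot be used to eliminate the 1 below it. Of course, the right 
thing to do is to change the order of the equations before starting: 

\[
\begin{array}{rcr} x + y &=& 7\\ 0x + y &=& -2 \end{array}
\;\Leftrightarrow\;
\left(\begin{array}{rr|r} 1 & 1 & 7\\ 0 & 1 & -2 \end{array}\right)
\sim
\left(\begin{array}{rr|r} 1 & 0 & 9\\ 0 & 1 & -2 \end{array}\right)
\;\Leftrightarrow\;
\begin{array}{rcr} x + 0 &=& 9\\ 0 + y &=& -2 \end{array}
\]

The third augmented matrix above is the RREF of the first and second. That is to 
say, you can swap rows on your way to RREF. 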


For larger systems of equations, these three kinds of problems are the 
obstruction to obtaining the identity matrix, and hence to a simple statement 
of a solution in the form x = a, y = b, ... . What can we do to maximally 
simplify a system of equations in general? We need to perform operations 
that simplify our system without changing its solutions. Because exchanging 
the order of equations, multiplying one equation by a non-zero constant, or 
adding equations does not change the system's solutions, we are led to three 
operations: 


• (Row Swap) Exchange any two rows. 
• (Scalar Multiplication) Multiply any row by a non-zero constant. 
• (Row Sum) Add a multiple of one row to another row. 


These are called Elementary Row Operations, or EROs for short, and are 
studied in detail in section 2.3. Suppose now we have a general augmented 
matrix for which the first entry in the first row does not vanish. Then, using 
just the three EROs, we could then perform the following algorithm: 


• Make the leftmost nonzero entry in the top row 1 by multiplication. 
• Then use that 1 as a pivot to eliminate everything below it. 
• Then go to the next row and make the leftmost nonzero entry 1. 
• Use that 1 as a pivot to eliminate everything below and above it! 
• Go to the next row and make the leftmost nonzero entry 1... etc. 


In the case that the first entry of the first row is zero, we may first interchange 
the first row with another row whose first entry is non-vanishing and then 
perform the above algorithm. If the entire first column vanishes, we may still 
apply the algorithm on the remaining columns. 
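
The procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch, not a definitive implementation: the function name `rref` is our own, exact arithmetic is done with `fractions.Fraction`, and every column (including the one right of the divide) is treated the same way, so an inconsistent system ends with a pivot in the last column.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form of `matrix` (a list of rows)."""
    A = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(A)
    n_cols = len(A[0]) if A else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Row Swap: find a row at or below pivot_row with a non-zero entry here.
        candidates = [r for r in range(pivot_row, n_rows) if A[r][col] != 0]
        if not candidates:
            continue  # nothing to pivot on in this column; try the next column
        r = candidates[0]
        A[pivot_row], A[r] = A[r], A[pivot_row]
        # Scalar Multiplication: make the pivot equal to 1.
        p = A[pivot_row][col]
        A[pivot_row] = [x / p for x in A[pivot_row]]
        # Row Sum: eliminate every other entry in the pivot's column.
        for other in range(n_rows):
            if other != pivot_row and A[other][col] != 0:
                factor = A[other][col]
                A[other] = [x - factor * y for x, y in zip(A[other], A[pivot_row])]
        pivot_row += 1
    return A

# The "silly order" system of Example 12: a row swap is needed first.
print(rref([[0, 1, -2], [1, 1, 7]]))
```

Unlike the step-by-step description above, this sketch clears entries above and below each pivot in the same pass; the resulting RREF is the same.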
Beginner Elimination 


This algorithm is known as Gaussian elimination, its endpoint is an aug- 
mented matrix of the form 


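\[
\left(\begin{array}{cccccccc|c}
1 & * & 0 & * & 0 & \cdots & 0 & * & b^1\\
0 & 0 & 1 & * & 0 & \cdots & 0 & * & b^2\\
0 & 0 & 0 & 0 & 1 & \cdots & 0 & * & b^3\\
\vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots\\
0 & 0 & 0 & 0 & 0 & \cdots & 1 & * & b^k\\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & b^{k+1}\\
\vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots\\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & b^r
\end{array}\right)
\]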


This is called Reduced Row Echelon Form (RREF). The asterisks denote 
the possibility of arbitrary numbers (e.g., the second 1 in the top line of 
example 10). The following properties define RREF: 


1. In every row the left most non-zero entry is 1 (and is called a pivot). 


2. The pivot of any given row is always to the right of the pivot of the 
row above it. 


3. The pivot is the only non-zero entry in its column. 
Here are some examples: 


Example 13 (Augmented matrix in RREF) 


\[
\left(\begin{array}{ccc|c}
1 & 0 & 7 & 0\\
0 & 1 & 3 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 0 & 0
\end{array}\right)
\]

Example 14 (Augmented matrix NOT in RREF) 
\[
\left(\begin{array}{ccc|c}
1 & 0 & 3 & 0\\
0 & 0 & 2 & 0\\
0 & 1 & 0 & 1\\
0 & 0 & 0 & 1
\end{array}\right)
\]


Actually, this NON-example breaks all three of the rules! 
The reason we need the asterisks in the general form of RREF is that 
not every column need have a pivot, as demonstrated in examples 10 and 13. 
Here is an example where multiple columns have no pivot: 


Example 15 (Consecutive columns with no pivot in RREF) 
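\[
\begin{array}{rcr} x + y + z + 0w &=& 2\\ 2x + 2y + 2z + w &=& 4 \end{array}
\;\Leftrightarrow\;
\left(\begin{array}{rrrr|r} 1 & 1 & 1 & 0 & 2\\ 2 & 2 & 2 & 1 & 4 \end{array}\right)
\sim
\left(\begin{array}{rrrr|r} 1 & 1 & 1 & 0 & 2\\ 0 & 0 & 0 & 1 & 0 \end{array}\right)
\;\Leftrightarrow\;
\begin{array}{rcr} x + y + z &=& 2\\ w &=& 0. \end{array}
\]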


Note that there was no hope of reaching the identity matrix, because of the shape of 
the augmented matrix we started with. 


An Advanced Elimination 


It is important that you are able to convert RREF back into a set of 
equations. The first thing you might notice is that if any of the numbers 
$b^{k+1}, \ldots, b^r$ are non-zero then the system of equations is inconsistent and has 
no solutions. Our next task is to extract all possible solutions from an RREF 
augmented matrix. 


2.1.4 Solutions and RREF 
RREF is a maximally simplified version of the original system of equations 
in the following sense: 


• As many coefficients as possible of the variables vanish. 

• As many coefficients as possible of the variables are unity. 


It is easier to read off solutions from the maximally simplified equations than 
from the original equations, even when there are infinitely many solutions. 


Example 16 
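\[
\begin{array}{rcr} x + y + 5w &=& 1\\ y + 2w &=& 6\\ z + 4w &=& 8 \end{array}
\;\Leftrightarrow\;
\left(\begin{array}{rrrr|r} 1 & 1 & 0 & 5 & 1\\ 0 & 1 & 0 & 2 & 6\\ 0 & 0 & 1 & 4 & 8 \end{array}\right)
\sim
\left(\begin{array}{rrrr|r} 1 & 0 & 0 & 3 & -5\\ 0 & 1 & 0 & 2 & 6\\ 0 & 0 & 1 & 4 & 8 \end{array}\right)
\;\Leftrightarrow\;
\begin{array}{rcr} x + 3w &=& -5\\ y + 2w &=& 6\\ z + 4w &=& 8 \end{array}
\]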
In this case, we say that x, y, and z are pivot variables because they appear with a 
pivot coefficient in RREF. Since w never appears with a pivot coefficient, it is not a 
pivot variable. One way to express the solutions to this system of equations is to put 
all the pivot variables on one side and all the non-pivot variables on the other side. 
It is also nice to add the “empty equation” w = w to obtain the system 


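\[
\begin{array}{rcl} x &=& -5 - 3w\\ y &=& 6 - 2w\\ z &=& 8 - 4w\\ w &=& w \end{array}
\;\Leftrightarrow\;
\begin{pmatrix} x\\ y\\ z\\ w \end{pmatrix}
= \begin{pmatrix} -5\\ 6\\ 8\\ 0 \end{pmatrix}
+ w \begin{pmatrix} -3\\ -2\\ -4\\ 1 \end{pmatrix},
\]

which we have written as the solution to the corresponding matrix problem. There are 
infinitely many solutions, one for each value of w. We call the collection of all solutions 
the solution set. A good check is to set w = 0 and see if the system is solved. 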


The last example demonstrated the standard approach for solving a sys- 
tem of linear equations in its entirety: 


1. Write the augmented matrix. 
2. Perform EROs to reach RREF. 
3. Express the pivot variables in terms of the non-pivot variables. 


There are always exactly enough non-pivot variables to index your solutions. 
In any approach, the variables which are not expressed in terms of the other 
variables are called free variables. The standard approach is to use the non- 
pivot variables as free variables. 

Non-standard approach: solve for w in terms of z and substitute into the 
other equations. You now have an expression for each component in terms 
of z. But why pick z instead of y or x? (or x+y?) The standard approach 
not only feels natural, but is canonical, meaning that everyone will get the 
same RREF and hence choose the same variables to be free. However, it is 
important to remember that so long as their set of solutions is the same, any 
two choices of free variables are fine. (You might think of this as the difference 
between using Google Maps™ or Mapquest™; although their maps may 
look different, the place (home sic) they are describing is the same!) 

When you see an RREF augmented matrix with two columns that have 
no pivot, you know there will be two free variables. 
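
Counting free variables can be done mechanically. The following is a minimal sketch, assuming the matrix is already in RREF and is stored as a list of rows with the last entry of each row being the part right of the divide; the function names `pivot_columns` and `free_columns` are illustrative.

```python
def pivot_columns(R, n_unknowns):
    """Columns (numbered from 0) left of the divide that contain a pivot of the RREF matrix R."""
    pivots = []
    for row in R:
        for col in range(n_unknowns):
            if row[col] != 0:
                pivots.append(col)  # leftmost non-zero entry of this row
                break
    return pivots

def free_columns(R, n_unknowns):
    """Columns left of the divide without a pivot; these index the free variables."""
    pivots = set(pivot_columns(R, n_unknowns))
    return [c for c in range(n_unknowns) if c not in pivots]

# An RREF augmented matrix in the unknowns x, y, z, w (columns 0 to 3).
R = [[1, 0, 7, 0, 4],
     [0, 1, 3, 4, 1],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0]]
print(pivot_columns(R, 4), free_columns(R, 4))
```

Here columns 2 and 3 have no pivot, so z and w are the free variables in the standard approach.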




Example 17 
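\[
\left(\begin{array}{cccc|c}
1 & 0 & 7 & 0 & 4\\
0 & 1 & 3 & 4 & 1\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0
\end{array}\right)
\;\Leftrightarrow\;
\begin{array}{rcl} x + 7z &=& 4\\ y + 3z + 4w &=& 1 \end{array}
\]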


Expressing the pivot variables in terms of the non-pivot variables, and using two empty 
equations gives 


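\[
\begin{array}{rcl} x &=& 4 - 7z\\ y &=& 1 - 3z - 4w\\ z &=& z\\ w &=& w \end{array}
\;\Leftrightarrow\;
\begin{pmatrix} x\\ y\\ z\\ w \end{pmatrix}
= \begin{pmatrix} 4\\ 1\\ 0\\ 0 \end{pmatrix}
+ z \begin{pmatrix} -7\\ -3\\ 1\\ 0 \end{pmatrix}
+ w \begin{pmatrix} 0\\ -4\\ 0\\ 1 \end{pmatrix}
\]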


There are infinitely many solutions; one for each pair of numbers z, w. 


Solution set in set notation 


You can imagine having three, four, or fifty-six non-pivot columns and 
the same number of free variables indexing your solutions set. You need to 
become very adept at reading off solutions of linear systems from the RREF 
of their augmented matrix. 


Worked examples of Gaussian elimination 


2.2 Review Problems 


Webwork: 
Reading problems     1, 2 
Augmented matrix     6 
2 × 2 systems        1, 8, 9, 10, 11, 12 
3 × 2 systems        13, 14 
3 × 3 systems        15, 16, 17 


1. State whether the following augmented matrices are in RREF and com- 
pute their solution sets. 


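\[
\left(\begin{array}{ccccc|c}
1 & 0 & 0 & 0 & 3 & 1\\
0 & 1 & 0 & 0 & 1 & 2\\
0 & 0 & 1 & 0 & 1 & 3\\
0 & 0 & 0 & 1 & 2 & 0
\end{array}\right),
\]
\[
\left(\begin{array}{cccccc|c}
1 & 1 & 0 & 1 & 0 & 1 & 0\\
0 & 0 & 1 & 2 & 0 & 2 & 0\\
0 & 0 & 0 & 0 & 1 & 3 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right),
\]
\[
\left(\begin{array}{ccccccc|r}
1 & 1 & 0 & 1 & 0 & 1 & 0 & 1\\
0 & 0 & 1 & 2 & 0 & 2 & 0 & -1\\
0 & 0 & 0 & 0 & 1 & 3 & 0 & 1\\
0 & 0 & 0 & 0 & 0 & 2 & 0 & -2\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1
\end{array}\right).
\]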


2. Solve the following linear system: 


\[
\begin{array}{rcr}
2x_1 + 5x_2 - 8x_3 + 2x_4 + 2x_5 &=& 0\\
6x_1 + 2x_2 - 10x_3 + 6x_4 + 8x_5 &=& 6\\
3x_1 + 6x_2 + 2x_3 + 3x_4 + 5x_5 &=& 6\\
3x_1 + 1x_2 - 5x_3 + 3x_4 + 4x_5 &=& 3\\
6x_1 + 7x_2 - 3x_3 + 6x_4 + 9x_5 &=& 9
\end{array}
\]


Be sure to set your work out carefully with equivalence signs ~ between 
each step, labeled by the row operations you performed. 


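3. Check that the following two matrices are row-equivalent: 
\[
\left(\begin{array}{rrr|r} 1 & 4 & 7 & 10\\ 2 & 9 & 6 & 0 \end{array}\right)
\quad\text{and}\quad
\left(\begin{array}{rrr|r} 0 & -1 & 8 & 20\\ 4 & 18 & 12 & 0 \end{array}\right).
\]
Now remove the third column from each matrix, and show that the 
resulting two matrices (shown below) are row-equivalent: 
\[
\left(\begin{array}{rr|r} 1 & 4 & 10\\ 2 & 9 & 0 \end{array}\right)
\quad\text{and}\quad
\left(\begin{array}{rr|r} 0 & -1 & 20\\ 4 & 18 & 0 \end{array}\right).
\]
Now remove the fourth column from each of the original two matrices, 
and show that the resulting two matrices, viewed as augmented 
matrices (shown below) are row-equivalent: 
\[
\left(\begin{array}{rr|r} 1 & 4 & 7\\ 2 & 9 & 6 \end{array}\right)
\quad\text{and}\quad
\left(\begin{array}{rr|r} 0 & -1 & 8\\ 4 & 18 & 12 \end{array}\right).
\]
Explain why row-equivalence is never affected by removing columns. 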
4. Check that the system of equations corresponding to the augmented 
matrix 
\[
\left(\begin{array}{rr|r} 1 & 4 & 10\\ 3 & 13 & 9\\ 4 & 17 & 20 \end{array}\right)
\]


has no solutions. If you remove one of the rows of this matrix, does 
the new matrix have any solutions? In general, can row equivalence be 
affected by removing rows? Explain why or why not. 


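5. Explain why the linear system has no solutions: 
\[
\left(\begin{array}{rrr|r} 1 & 0 & 3 & 1\\ 0 & 1 & 2 & 4\\ 0 & 0 & 0 & 6 \end{array}\right)
\]
For which values of k does the system below have a solution? 
\[
\begin{array}{rcr}
x - 3y &=& 6\\
x + 3z &=& -3\\
2x + ky + (8-k)z &=& 1
\end{array}
\]

Hint 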


6. Show that the RREF of a matrix is unique. (Hint: Consider what 
happens if the same augmented matrix had two different RREFs. Try 
to see what happens if you removed columns from these two RREF 
augmented matrices.) 


7. Another method for solving linear systems is to use row operations to 
bring the augmented matrix to Row Echelon Form (REF as opposed to 
RREF). In REF, the pivots are not necessarily set to one, and we only 
require that all entries left of the pivots are zero, not necessarily entries 
above a pivot. Provide a counterexample to show that row echelon form 
is not unique. 

Once a system is in row echelon form, it can be solved by “back substitution.” 
Write the following row echelon matrix as a system of equations, 
then solve the system using back-substitution. 
\[
\left(\begin{array}{rrr|r} 2 & 3 & 1 & 6\\ 0 & 1 & 1 & 2\\ 0 & 0 & 3 & 3 \end{array}\right)
\]

8. Show that this pair of augmented matrices are row equivalent, assuming 
ad − bc ≠ 0: 
\[
\left(\begin{array}{cc|c} a & b & e\\ c & d & f \end{array}\right)
\sim
\left(\begin{array}{cc|c} 1 & 0 & \frac{de - bf}{ad - bc}\\ 0 & 1 & \frac{af - ce}{ad - bc} \end{array}\right)
\]

9. Consider the augmented matrix: 
\[
\left(\begin{array}{rr|r} 2 & -1 & 3\\ -6 & 3 & 1 \end{array}\right).
\]
Give a geometric reason why the associated system of equations has 
no solution. (Hint: plot the three vectors given by the columns of this 
augmented matrix in the plane.) Given a general augmented matrix 
\[
\left(\begin{array}{cc|c} a & b & e\\ c & d & f \end{array}\right),
\]
can you find a condition on the numbers a, b, c and d that corresponds 
to the geometric condition you found? 


10. A relation ~ on a set of objects U is an equivalence relation if the 
following three properties are satisfied: 

• Reflexive: For any x ∈ U, we have x ~ x. 

• Symmetric: For any x, y ∈ U, if x ~ y then y ~ x. 

• Transitive: For any x, y and z ∈ U, if x ~ y and y ~ z then x ~ z. 

Show that row equivalence of matrices is an example of an equivalence 
relation. 

(For a discussion of equivalence relations, see Homework 0, Problem 4) 


Hint 
11. Equivalence of augmented matrices does not come from equality of their 
solution sets. Rather, we define two matrices to be equivalent if one can 
be obtained from the other by elementary row operations. 


Find a pair of augmented matrices that are not row equivalent but do 


have the same solution set. 


2.3 Elementary Row Operations 


Elementary row operations are systems of linear equations relating the old 
and new rows in Gaussian elimination: 


Example 18 (Keeping track of EROs with equations between rows) 


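We refer to the new kth row as $R'_k$ and the old kth row as $R_k$. 

\[
\left(\begin{array}{rrr|r} 0 & 1 & 1 & 7\\ 2 & 0 & 0 & 4\\ 0 & 0 & 1 & 4 \end{array}\right)
\stackrel{\substack{R'_1 = 0R_1 + R_2 + 0R_3\\ R'_2 = R_1 + 0R_2 + 0R_3\\ R'_3 = 0R_1 + 0R_2 + R_3}}{\sim}
\left(\begin{array}{rrr|r} 2 & 0 & 0 & 4\\ 0 & 1 & 1 & 7\\ 0 & 0 & 1 & 4 \end{array}\right)
\quad\Leftrightarrow\quad
\begin{pmatrix} R'_1\\ R'_2\\ R'_3 \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} R_1\\ R_2\\ R_3 \end{pmatrix}
\]
\[
\stackrel{\substack{R'_1 = \frac12 R_1 + 0R_2 + 0R_3\\ R'_2 = 0R_1 + R_2 + 0R_3\\ R'_3 = 0R_1 + 0R_2 + R_3}}{\sim}
\left(\begin{array}{rrr|r} 1 & 0 & 0 & 2\\ 0 & 1 & 1 & 7\\ 0 & 0 & 1 & 4 \end{array}\right)
\quad\Leftrightarrow\quad
\begin{pmatrix} R'_1\\ R'_2\\ R'_3 \end{pmatrix}
= \begin{pmatrix} \frac12 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} R_1\\ R_2\\ R_3 \end{pmatrix}
\]
\[
\stackrel{\substack{R'_1 = R_1 + 0R_2 + 0R_3\\ R'_2 = 0R_1 + R_2 - R_3\\ R'_3 = 0R_1 + 0R_2 + R_3}}{\sim}
\left(\begin{array}{rrr|r} 1 & 0 & 0 & 2\\ 0 & 1 & 0 & 3\\ 0 & 0 & 1 & 4 \end{array}\right)
\quad\Leftrightarrow\quad
\begin{pmatrix} R'_1\\ R'_2\\ R'_3 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & -1\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} R_1\\ R_2\\ R_3 \end{pmatrix}
\]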


On the right, we have listed the relations between old and new rows in matrix notation. 


Reading homework: problem 3 


2.3.1 EROs and Matrices 


The matrix describing the system of equations relating rows performs the 
corresponding ERO on the augmented matrix: 
Example 19 (Performing EROs with Matrices) 


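\[
\begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}
\left(\begin{array}{rrr|r} 0 & 1 & 1 & 7\\ 2 & 0 & 0 & 4\\ 0 & 0 & 1 & 4 \end{array}\right)
= \left(\begin{array}{rrr|r} 2 & 0 & 0 & 4\\ 0 & 1 & 1 & 7\\ 0 & 0 & 1 & 4 \end{array}\right)
\]
\[
\begin{pmatrix} \frac12 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}
\left(\begin{array}{rrr|r} 2 & 0 & 0 & 4\\ 0 & 1 & 1 & 7\\ 0 & 0 & 1 & 4 \end{array}\right)
= \left(\begin{array}{rrr|r} 1 & 0 & 0 & 2\\ 0 & 1 & 1 & 7\\ 0 & 0 & 1 & 4 \end{array}\right)
\]
\[
\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & -1\\ 0 & 0 & 1 \end{pmatrix}
\left(\begin{array}{rrr|r} 1 & 0 & 0 & 2\\ 0 & 1 & 1 & 7\\ 0 & 0 & 1 & 4 \end{array}\right)
= \left(\begin{array}{rrr|r} 1 & 0 & 0 & 2\\ 0 & 1 & 0 & 3\\ 0 & 0 & 1 & 4 \end{array}\right)
\]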


Here we have multiplied the augmented matrix with the matrices that acted on rows 
listed on the right of example 18. 
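
The three products of example 19 can be checked with a short Python sketch. The helper `matmul` and the variable names are illustrative; matrices are stored as lists of rows, and the column right of the divide is simply carried along as a fourth column.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

half = Fraction(1, 2)
swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]        # exchange rows 1 and 2
scale = [[half, 0, 0], [0, 1, 0], [0, 0, 1]]    # multiply row 1 by 1/2
row_sum = [[1, 0, 0], [0, 1, -1], [0, 0, 1]]    # row 2 <- row 2 - row 3

augmented = [[0, 1, 1, 7], [2, 0, 0, 4], [0, 0, 1, 4]]

step1 = matmul(swap, augmented)
step2 = matmul(scale, step1)
step3 = matmul(row_sum, step2)
print(step3)
```

Each left multiplication performs one ERO, so `step3` holds the fully reduced augmented matrix.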


Realizing EROs as matrices allows us to give a concrete notion of “di- 
viding by a matrix”; we can now perform manipulations on both sides of an 
equation in a familiar way: 


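Example 20 (Undoing A in Ax = b slowly, for A = 6 = 3 · 2) 

\[
\begin{array}{crcl}
 & 6x &=& 12\\
\Leftrightarrow & 3^{-1}\,6x &=& 3^{-1}\,12\\
\Leftrightarrow & 2x &=& 4\\
\Leftrightarrow & 2^{-1}\,2x &=& 2^{-1}\,4\\
\Leftrightarrow & 1x &=& 2
\end{array}
\]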


The matrices corresponding to EROs undo a matrix step by step. 
Example 21 (Undoing A in Ax = b slowly, for A = M =...) 

This is another way of thinking about Gaussian elimination which feels more 
like elementary algebra in the sense that you “do something to both sides of 
an equation” until you have a solution. 


2.3.2 Recording EROs in (M|I) 


Just as we put together $3^{-1}2^{-1} = 6^{-1}$ to get a single thing to apply to both 
sides of 6x = 12 to undo 6, we should put together multiple EROs to get 
a single thing that undoes our matrix. To do this, augment by the identity 
matrix (not just a single column) and then perform Gaussian elimination. 
There is no need to write the EROs as systems of equations or as matrices 
while doing this. 
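Example 22 (Collecting EROs that undo a matrix) 
\[
\left(\begin{array}{rrr|rrr} 0 & 1 & 1 & 1 & 0 & 0\\ 2 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right)
\sim
\left(\begin{array}{rrr|rrr} 2 & 0 & 0 & 0 & 1 & 0\\ 0 & 1 & 1 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right)
\]
\[
\sim
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 0 & \frac12 & 0\\ 0 & 1 & 1 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right)
\sim
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 0 & \frac12 & 0\\ 0 & 1 & 0 & 1 & 0 & -1\\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right).
\]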


As we changed the left side from the matrix M to the identity matrix, the 
right side changed from the identity matrix to the matrix which undoes M: 


Example 23 (Checking that one matrix undoes another) 


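\[
\begin{pmatrix} 0 & \frac12 & 0\\ 1 & 0 & -1\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & 1 & 1\\ 2 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}
\]

If the matrices are composed in the opposite order, the result is the same. 

\[
\begin{pmatrix} 0 & 1 & 1\\ 2 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & \frac12 & 0\\ 1 & 0 & -1\\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}
\]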


Whenever the product of two matrices MN = I, we say that N is the 
inverse of M or $N = M^{-1}$ and conversely M is the inverse of N or $M = N^{-1}$. 
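
The (M|I) procedure can be sketched in Python as follows. This is a hedged illustration rather than a prescribed method: the function name `inverse` is our own, exact fractions are used, and `None` is returned when some column has no pivot, that is, when the RREF of M is not the identity.

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix by row reducing (M|I); return None if M is not invertible."""
    n = len(M)
    # Build the augmented matrix (M|I).
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        # Row Swap: bring a row with a non-zero entry in this column into place.
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return None  # no pivot in this column, so M cannot be reduced to I
        A[col], A[pivot] = A[pivot], A[col]
        # Scalar Multiplication: make the pivot 1.
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        # Row Sum: clear the rest of the column.
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    # The right half of (I|M^{-1}) is the inverse.
    return [row[n:] for row in A]

M = [[0, 1, 1], [2, 0, 0], [0, 0, 1]]
M_inv = inverse(M)
print(M_inv)
```

For the matrix M of example 22 this reproduces the right-hand block found there.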

In abstract generality, let M be some matrix and, as always, let I stand 
for the identity matrix. Imagine the process of performing elementary row 
operations to bring M to the identity matrix: 

\[
(M|I) \sim (E_1M|E_1) \sim (E_2E_1M|E_2E_1) \sim \cdots \sim (I|\cdots E_2E_1).
\]

The ellipses “···” stand for additional EROs. The result is a product of 
matrices that form a matrix which undoes M: 

\[
\cdots E_2E_1M = I.
\]


This is only true if the RREF of M is the identity matrix. In that case, we 
say M is invertible. 

Much use is made of the fact that invertible matrices can be undone with 
EROs. To begin with, since each elementary row operation has an inverse, 
\[
M = E_1^{-1}E_2^{-1}\cdots,
\]
while the inverse of M is 
\[
M^{-1} = \cdots E_2E_1.
\]
This is symbolically verified by 
\[
M^{-1}M = \cdots E_2E_1\,E_1^{-1}E_2^{-1}\cdots = \cdots E_2\,E_2^{-1}\cdots = \cdots = I.
\]

Thus, if M is invertible, then M can be expressed as the product of EROs. 
(The same is true for its inverse.) This has the feel of the fundamental 
theorem of arithmetic (integers can be expressed as the product of primes) 
or the fundamental theorem of algebra (polynomials can be expressed as the 


product of [complex] first order polynomials); EROs are building blocks of 
invertible matrices. 


2.3.3 The Three Elementary Matrices 


We now work toward concrete examples and applications. It is surprisingly 
easy to translate between EROs and matrices that perform EROs. The 
matrices corresponding to these kinds are close in form to the identity matrix: 


• Row Swap: Identity matrix with two rows swapped. 
• Scalar Multiplication: Identity matrix with one diagonal entry not 1. 
• Row Sum: The identity matrix with one off-diagonal entry not 0. 


Example 24 (Correspondences between EROs and their matrices) 


• The row swap matrix that swaps the 2nd and 4th row is the identity matrix with 
the 2nd and 4th row swapped: 
\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 0\\
0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]

• The scalar multiplication matrix that replaces the 3rd row with 7 times the 3rd 
row is the identity matrix with 7 in the 3rd row instead of 1: 
\[
\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 7 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
\]

• The row sum matrix that replaces the 4th row with the 4th row plus 9 times 
the 2nd row is the identity matrix with a 9 in the 4th row, 2nd column: 
\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0 & 0\\
0 & 9 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
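
The three kinds of elementary matrices are easy to generate programmatically. Below is a minimal sketch; the function names and the convention of numbering rows from 1 (to match the text) are our own choices.

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def row_swap(n, i, j):
    """n x n matrix that exchanges rows i and j (rows numbered from 1)."""
    E = identity(n)
    E[i - 1], E[j - 1] = E[j - 1], E[i - 1]
    return E

def scalar_multiplication(n, i, c):
    """n x n matrix that multiplies row i by the non-zero constant c."""
    E = identity(n)
    E[i - 1][i - 1] = c
    return E

def row_sum(n, target, source, c):
    """n x n matrix that adds c times row `source` to row `target`."""
    E = identity(n)
    E[target - 1][source - 1] = c
    return E

print(row_swap(5, 2, 4))
print(scalar_multiplication(4, 3, 7))
print(row_sum(7, 4, 2, 9))
```

Multiplying a matrix on the left by one of these performs the corresponding ERO on its rows.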


We can write an explicit factorization of a matrix into EROs by keeping 
track of the EROs used in getting to RREF. 


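Example 25 (Express M from Example 22 as a product of EROs) 
Note that in the previous example one of each of the kinds of EROs is used, in the 
order just given. Elimination looked like 

\[
M = \begin{pmatrix} 0 & 1 & 1\\ 2 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}
\stackrel{E_1}{\sim}
\begin{pmatrix} 2 & 0 & 0\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{pmatrix}
\stackrel{E_2}{\sim}
\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{pmatrix}
\stackrel{E_3}{\sim}
\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}
= I
\]

where the ERO matrices are 
\[
E_1 = \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix},\quad
E_2 = \begin{pmatrix} \frac12 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix},\quad
E_3 = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & -1\\ 0 & 0 & 1 \end{pmatrix}.
\]

The inverses of the ERO matrices (corresponding to the description of the reverse row 
manipulations) are 

\[
E_1^{-1} = \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix},\quad
E_2^{-1} = \begin{pmatrix} 2 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix},\quad
E_3^{-1} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{pmatrix}.
\]
Multiplying these gives 
\[
E_1^{-1}E_2^{-1}E_3^{-1}
= \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{pmatrix}
\]
\[
= \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 0 & 0\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 1\\ 2 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}
= M.
\]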
2.3.4 LU, LDU, and LDPU Factorizations 


The process of elimination can be stopped halfway to obtain decompositions 
frequently used in large computations in sciences and engineering. The first 
half of the elimination process is to eliminate entries below the diagonal, 
leaving a matrix which is called upper triangular. The elementary matrices 
which perform this part of the elimination are lower triangular, as are their 
inverses. By putting together the upper triangular and lower triangular 
parts one obtains the so-called LU factorization. 


Example 26 (LU factorization) 


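\[
M = \begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ -4 & 0 & 9 & 2\\ 0 & -1 & 1 & -1 \end{pmatrix}
\stackrel{E_1}{\sim}
\begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ 0 & 0 & 3 & 4\\ 0 & -1 & 1 & -1 \end{pmatrix}
\]
\[
\stackrel{E_2}{\sim}
\begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ 0 & 0 & 3 & 4\\ 0 & 0 & 3 & 1 \end{pmatrix}
\stackrel{E_3}{\sim}
\begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ 0 & 0 & 3 & 4\\ 0 & 0 & 0 & -3 \end{pmatrix}
=: U,
\]
where the EROs and their inverses are 
\[
E_1 = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 2 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},\quad
E_2 = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 1 \end{pmatrix},\quad
E_3 = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & -1 & 1 \end{pmatrix},
\]
\[
E_1^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ -2 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},\quad
E_2^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & -1 & 0 & 1 \end{pmatrix},\quad
E_3^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 1 & 1 \end{pmatrix}.
\]

Applying inverse elementary matrices to both sides of the equality $U = E_3E_2E_1M$ 
gives $M = E_1^{-1}E_2^{-1}E_3^{-1}U$, or 
\[
\begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ -4 & 0 & 9 & 2\\ 0 & -1 & 1 & -1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ -2 & 0 & 1 & 0\\ 0 & -1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ 0 & 0 & 3 & 4\\ 0 & 0 & 0 & -3 \end{pmatrix}.
\]

This is a lower triangular matrix times an upper triangular matrix. 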
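
As a rough sketch of how such a factorization can be computed, the Python function below eliminates below the diagonal using only row sums and records, for each step, the multiplier that the inverse ERO would add back. The name `lu` is our own, and the sketch assumes no row exchange is ever needed (it raises an error otherwise).

```python
from fractions import Fraction

def lu(M):
    """Return (L, U) with M = L U, assuming no row exchanges are needed."""
    n = len(M)
    U = [[Fraction(x) for x in row] for row in M]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        for row in range(col + 1, n):
            if U[row][col] == 0:
                continue  # nothing to eliminate in this position
            if U[col][col] == 0:
                raise ValueError("zero pivot: a row exchange would be needed")
            factor = U[row][col] / U[col][col]
            # Row Sum: subtract factor times the pivot row to clear the entry below the pivot.
            U[row] = [x - factor * y for x, y in zip(U[row], U[col])]
            # The inverse ERO adds factor times the pivot row back, so L records factor.
            L[row][col] = factor
    return L, U

M = [[2, 0, -3, 1], [0, 1, 2, 2], [-4, 0, 9, 2], [0, -1, 1, -1]]
L, U = lu(M)
print(L)
print(U)
```

For the matrix M of this example, the sketch returns the same lower and upper triangular factors as the hand computation.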
What if we stop at a different point in elimination? We could multiply 
rows so that the entries in the diagonal are 1 next. Note that the EROs that 
do this are diagonal. This gives a slightly different factorization. 


Example 27 (LDU factorization building from previous example) 


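\[
M = \begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ -4 & 0 & 9 & 2\\ 0 & -1 & 1 & -1 \end{pmatrix}
\stackrel{E_3E_2E_1}{\sim}
\begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ 0 & 0 & 3 & 4\\ 0 & 0 & 0 & -3 \end{pmatrix}
\stackrel{E_4}{\sim}
\begin{pmatrix} 1 & 0 & -\frac32 & \frac12\\ 0 & 1 & 2 & 2\\ 0 & 0 & 3 & 4\\ 0 & 0 & 0 & -3 \end{pmatrix}
\]
\[
\stackrel{E_5}{\sim}
\begin{pmatrix} 1 & 0 & -\frac32 & \frac12\\ 0 & 1 & 2 & 2\\ 0 & 0 & 1 & \frac43\\ 0 & 0 & 0 & -3 \end{pmatrix}
\stackrel{E_6}{\sim}
\begin{pmatrix} 1 & 0 & -\frac32 & \frac12\\ 0 & 1 & 2 & 2\\ 0 & 0 & 1 & \frac43\\ 0 & 0 & 0 & 1 \end{pmatrix}
=: U
\]
The corresponding elementary matrices are 
\[
E_4 = \begin{pmatrix} \frac12 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},\quad
E_5 = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & \frac13 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},\quad
E_6 = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -\frac13 \end{pmatrix},
\]
\[
E_4^{-1} = \begin{pmatrix} 2 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},\quad
E_5^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 3 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},\quad
E_6^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -3 \end{pmatrix}.
\]

The equation $U = E_6E_5E_4E_3E_2E_1M$ can be rearranged as 

\[
M = (E_1^{-1}E_2^{-1}E_3^{-1})(E_4^{-1}E_5^{-1}E_6^{-1})U.
\]

We calculated the product of the first three factors in the previous example; it was 
named L there, and we will reuse that name here. The product of the next three 
factors is diagonal and we will name it D. The last factor we named U (the name means 
something different in this example than the last example.) The LDU factorization 
of our matrix is 

\[
\begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ -4 & 0 & 9 & 2\\ 0 & -1 & 1 & -1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ -2 & 0 & 1 & 0\\ 0 & -1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 3 & 0\\ 0 & 0 & 0 & -3 \end{pmatrix}
\begin{pmatrix} 1 & 0 & -\frac32 & \frac12\\ 0 & 1 & 2 & 2\\ 0 & 0 & 1 & \frac43\\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]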
The LDU factorization of a matrix is a factorization into blocks of EROs 
of various types: L is the product of the inverses of EROs which eliminate 
below the diagonal by row addition, D the product of inverses of EROs which 
set the diagonal elements to 1 by row multiplication, and U is the product 
of inverses of EROs which eliminate above the diagonal by row addition. 

You may notice that one of the three kinds of row operation is missing 
from this story. Row exchange may be necessary to obtain RREF. Indeed, so 
far in this chapter we have been working under the tacit assumption that M 
can be brought to the identity by just row multiplication and row addition. 
If row exchange is necessary, the resulting factorization is LDPU where P is 
the product of inverses of EROs that perform row exchange. 


Example 28 (LDPU factorization, building from previous examples) 


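\[
M = \begin{pmatrix} 0 & 1 & 2 & 2\\ 2 & 0 & -3 & 1\\ -4 & 0 & 9 & 2\\ 0 & -1 & 1 & -1 \end{pmatrix}
\stackrel{E_7}{\sim}
\begin{pmatrix} 2 & 0 & -3 & 1\\ 0 & 1 & 2 & 2\\ -4 & 0 & 9 & 2\\ 0 & -1 & 1 & -1 \end{pmatrix}
\stackrel{E_6E_5E_4E_3E_2E_1}{\sim} U,
\qquad
E_7 = \begin{pmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} = E_7^{-1}.
\]
Here the row exchange $E_7$ is performed before the other EROs, so $U = E_6E_5E_4E_3E_2E_1E_7M$ and the inverse 
$P = E_7^{-1}$ ends up as the leftmost factor; in this example the factors therefore appear in the order PLDU: 
\[
M = (E_7^{-1})(E_1^{-1}E_2^{-1}E_3^{-1})(E_4^{-1}E_5^{-1}E_6^{-1})U = PLDU
\]
\[
\begin{pmatrix} 0 & 1 & 2 & 2\\ 2 & 0 & -3 & 1\\ -4 & 0 & 9 & 2\\ 0 & -1 & 1 & -1 \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ -2 & 0 & 1 & 0\\ 0 & -1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 3 & 0\\ 0 & 0 & 0 & -3 \end{pmatrix}
\begin{pmatrix} 1 & 0 & -\frac32 & \frac12\\ 0 & 1 & 2 & 2\\ 0 & 0 & 1 & \frac43\\ 0 & 0 & 0 & 1 \end{pmatrix}
\]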


2.4 Review Problems 


Webwork: 
Reading problems     3 
Matrix notation      18 
LU                   19 




1. While performing Gaussian elimination on these augmented matrices 
write the full system of equations describing the new rows in terms of 
the old rows above each equivalence symbol as in example 18. 


satii 11 0| 5 
col wey Cao 
ales 


2. Solve the vector equation by applying ERO matrices to each side of 
the equation to perform elimination. Show each matrix explicitly as in 
example 21. 


3 6 2 /z 
59 4) {fy]=[ 1 
24 2} \z 


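3. Solve this vector equation by finding the inverse of the matrix through 
$(M|I) \sim (I|M^{-1})$ and then applying $M^{-1}$ to both sides of the equation. 
\[
\begin{pmatrix} 2 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 2 \end{pmatrix}
\begin{pmatrix} x\\ y\\ z \end{pmatrix}
= \begin{pmatrix} 9\\ 6\\ 7 \end{pmatrix}
\]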


4. Follow the method of examples 26 and 27 to find the LU and LDU 
factorization of 
\[
\begin{pmatrix} 3 & 3 & 6\\ 3 & 5 & 2\\ 6 & 2 & 5 \end{pmatrix}.
\]


5. Multiple matrix equations with the same matrix can be solved simultaneously. 


(a) Solve both systems by performing elimination on just one aug- 
mented matrix. 


2 =l =] T 0 2 =] =] a 2 
=] 1 1 y|ļ=ļ|1],{—1 1 1 oe en 
1 —1 0 z 0 1 —1 0 c 1 
(b) What are the columns of $M^{-1}$ in $(M|I) \sim (I|M^{-1})$? 


6. How can you convince your fellow students to never make this mistake? 


Ri =Ri+Re2 
PO (3) Hee ya a6 
eee ae AEs 
2 0 1]4 1 2 6/9 


7. Is LU factorization of a matrix unique? Justify your answer. 


8. If you randomly create a matrix by picking numbers out of the blue, 
it will probably be difficult to perform elimination or factorization; 
fractions and large numbers will probably be involved. To invent simple 
problems it is better to start with a simple answer: 


(a) Start with any augmented matrix in RREF. Perform EROs to 
make most of the components non-zero. Write the result on a 
separate piece of paper and give it to your friend. Ask that friend 
to find RREF of the augmented matrix you gave them. Make sure 
they get the same augmented matrix you started with. 


(b) Create an upper triangular matrix U and a lower triangular ma- 
trix L with only 1s on the diagonal. Give the result to a friend to 
factor into LU form. 


(c) Do the same with an LDU factorization. 


2.5 Solution Sets for Systems of Linear Equations 


Algebra problems can have multiple solutions. For example x(x − 1) = 0 has 
two solutions: 0 and 1. By contrast, equations of the form Ax = b with A a 
linear operator (with scalars the real numbers) have the following property: 


If A is a linear operator and b is known, then Ax = b has either 


1. One solution 
2. No solutions 


3. Infinitely many solutions 
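
For systems given by an augmented matrix, this trichotomy can be checked mechanically by row reducing. The sketch below is illustrative only; the names `rref` and `count_solutions` are our own, and the last column of each row is taken to be the part right of the divide.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form of a list of rows, using exact fractions."""
    A = [[Fraction(x) for x in row] for row in matrix]
    pivot_row = 0
    for col in range(len(A[0])):
        rows = [r for r in range(pivot_row, len(A)) if A[r][col] != 0]
        if not rows:
            continue  # no pivot available in this column
        A[pivot_row], A[rows[0]] = A[rows[0]], A[pivot_row]
        p = A[pivot_row][col]
        A[pivot_row] = [x / p for x in A[pivot_row]]
        for r in range(len(A)):
            if r != pivot_row and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == len(A):
            break
    return A

def count_solutions(augmented):
    """Classify a linear system as having 'none', 'one', or 'infinitely many' solutions."""
    n_unknowns = len(augmented[0]) - 1
    pivots = 0
    for row in rref(augmented):
        if any(c != 0 for c in row[:n_unknowns]):
            pivots += 1          # this row carries a pivot left of the divide
        elif row[n_unknowns] != 0:
            return "none"        # a row reading 0 = (non-zero number)
    return "one" if pivots == n_unknowns else "infinitely many"

print(count_solutions([[1, 1, 5], [1, 2, 8]]))
print(count_solutions([[1, 1, 2], [2, 2, 5]]))
print(count_solutions([[1, 1, 2], [2, 2, 4]]))
```

A system has exactly one solution when every unknown's column has a pivot and no row is contradictory.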
2.5.1 The Geometry of Solution Sets: Hyperplanes 


Consider the following algebra problems and their solutions 
1. 6x = 12, one solution: 2 
2. 0x = 12, no solution 
3. 0x = 0, one solution for each number: x 


In each case the linear operator is a 1 x 1 matrix. In the first case, the linear 
operator is invertible. In the other two cases it is not. In the first case, the 
solution set is a point on the number line, in the third case the solution set 
is the whole number line. 

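Let's examine similar situations with larger matrices. 


1. $\begin{pmatrix} 6 & 0\\ 0 & 2 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 12\\ 6 \end{pmatrix}$, one solution: $\begin{pmatrix} 2\\ 3 \end{pmatrix}$ 

2. $\begin{pmatrix} 1 & 3\\ 0 & 0 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 4\\ 1 \end{pmatrix}$, no solutions 

3. $\begin{pmatrix} 1 & 3\\ 0 & 0 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 4\\ 0 \end{pmatrix}$, one solution for each number y: $\begin{pmatrix} 4 - 3y\\ y \end{pmatrix}$ 

4. $\begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}$, one solution for each pair of numbers x, y: $\begin{pmatrix} x\\ y \end{pmatrix}$ 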


Again, in the first case the linear operator is invertible while in the other 
cases it is not. When the operator is not invertible the solution set can be 
empty, a line in the plane or the plane itself. 

For a system of equations with r equations and k variables, one can have a 
number of different outcomes. For example, consider the case of r equations 
in three variables. Each of these equations is the equation of a plane in three- 
dimensional space. To find solutions to the system of equations, we look for 
the common intersection of the planes (if an intersection exists). Here we 
have five different possibilities: 


1. Unique Solution. The planes have a unique point of intersection. 


2. No solutions. Some of the equations are contradictory, so no solutions 
exist. 
3. Line. The planes intersect in a common line; any point on that line 
then gives a solution to the system of equations. 


4. Plane. Perhaps you only had one equation to begin with, or else all 
of the equations coincide geometrically. In this case, you have a plane 
of solutions, with two free parameters. 


Planes 


5. All of $\mathbb{R}^3$. If you start with no information, then any point in $\mathbb{R}^3$ is a 
solution. There are three free parameters. 


In general, for systems of equations with k unknowns, there are k + 2 
possible outcomes, corresponding to the possible numbers (i.e., 0, 1, 2, ..., k) 
of free parameters in the solution set, plus the possibility of no solutions. 
These types of “solution sets” are “hyperplanes”, generalizations of planes 
that behave like planes in $\mathbb{R}^3$ in many ways. 


Reading homework: problem 4 


Pictures and Explanation 


2.5.2 Particular Solution + Homogeneous Solutions 


In the standard approach, variables corresponding to columns that do not 
contain a pivot (after going to reduced row echelon form) are free. We called 
them non-pivot variables. They index elements of the solution set by acting 
as coefficients of vectors. 


Example 29 (Non-pivot columns determine terms of the solutions) 


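\[
\begin{pmatrix} 1 & 0 & 1 & -1\\ 0 & 1 & -1 & 1\\ 0 & 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 1\\ 1\\ 0 \end{pmatrix}
\;\Leftrightarrow\;
\begin{array}{rcl}
1x_1 + 0x_2 + 1x_3 - 1x_4 &=& 1\\
0x_1 + 1x_2 - 1x_3 + 1x_4 &=& 1\\
0x_1 + 0x_2 + 0x_3 + 0x_4 &=& 0
\end{array}
\]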
Following the standard approach, express the pivot variables in terms of the non-pivot 
variables and add “empty equations”. Here x3 and x4 are non-pivot variables. 


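\[
\begin{array}{rcl}
x_1 &=& 1 - x_3 + x_4\\
x_2 &=& 1 + x_3 - x_4\\
x_3 &=& x_3\\
x_4 &=& x_4
\end{array}
\;\Leftrightarrow\;
\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 1\\ 1\\ 0\\ 0 \end{pmatrix}
+ x_3 \begin{pmatrix} -1\\ 1\\ 1\\ 0 \end{pmatrix}
+ x_4 \begin{pmatrix} 1\\ -1\\ 0\\ 1 \end{pmatrix}
\]
The preferred way to write a solution set is with set notation. 
\[
S = \left\{ \begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 1\\ 1\\ 0\\ 0 \end{pmatrix}
+ \mu_1 \begin{pmatrix} -1\\ 1\\ 1\\ 0 \end{pmatrix}
+ \mu_2 \begin{pmatrix} 1\\ -1\\ 0\\ 1 \end{pmatrix}
: \mu_1, \mu_2 \in \mathbb{R} \right\}
\]

Notice that the first two components of the second two terms come from the non-pivot 
columns. Another way to write the solution set is 

\[
S = \{ X_0 + \mu_1 Y_1 + \mu_2 Y_2 : \mu_1, \mu_2 \in \mathbb{R} \},
\]

where 

\[
X_0 = \begin{pmatrix} 1\\ 1\\ 0\\ 0 \end{pmatrix},\quad
Y_1 = \begin{pmatrix} -1\\ 1\\ 1\\ 0 \end{pmatrix},\quad
Y_2 = \begin{pmatrix} 1\\ -1\\ 0\\ 1 \end{pmatrix}.
\]

Here $X_0$ is called a particular solution while $Y_1$ and $Y_2$ are called homogeneous 
solutions. 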


2.5.3 Solutions and Linearity 


Motivated by example 29, we say that the matrix equation MX = V has 
solution set $\{X_0 + \mu_1 Y_1 + \mu_2 Y_2 \mid \mu_1, \mu_2 \in \mathbb{R}\}$. Recall that matrices are linear 
operators. Thus 

\[
M(X_0 + \mu_1 Y_1 + \mu_2 Y_2) = MX_0 + \mu_1 MY_1 + \mu_2 MY_2 = V,
\]

for any $\mu_1, \mu_2 \in \mathbb{R}$. Choosing $\mu_1 = \mu_2 = 0$, we obtain 

\[
MX_0 = V.
\]
This is why $X_0$ is an example of a particular solution. 

Setting $\mu_1 = 1$, $\mu_2 = 0$, and using the particular solution $MX_0 = V$, we 
obtain 
\[
MY_1 = 0.
\]

Likewise, setting $\mu_1 = 0$, $\mu_2 = 1$, we obtain 
\[
MY_2 = 0.
\]

Here $Y_1$ and $Y_2$ are examples of what are called homogeneous solutions to 
the system. They do not solve the original equation MX = V, but instead 
its associated homogeneous equation MY = 0. 

We have just learnt a fundamental lesson of linear algebra: the solution 
set to Ax = b, where A is a linear operator, consists of a particular solution 
plus homogeneous solutions. 


{Solutions} = {Particular solution + Homogeneous solutions} 
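
This structure can be verified numerically for the matrix equation of example 29. The sketch below is illustrative: the helper names `apply` and `solution` are our own, and vectors are stored as plain lists.

```python
def apply(M, X):
    """Matrix M (a list of rows) acting on the column vector X (a list)."""
    return [sum(m * x for m, x in zip(row, X)) for row in M]

M = [[1, 0, 1, -1], [0, 1, -1, 1], [0, 0, 0, 0]]
V = [1, 1, 0]
X0 = [1, 1, 0, 0]    # particular solution
Y1 = [-1, 1, 1, 0]   # homogeneous solution
Y2 = [1, -1, 0, 1]   # homogeneous solution

def solution(mu1, mu2):
    """The element X0 + mu1*Y1 + mu2*Y2 of the solution set."""
    return [x + mu1 * y1 + mu2 * y2 for x, y1, y2 in zip(X0, Y1, Y2)]

print(apply(M, X0), apply(M, Y1), apply(M, Y2))
print(apply(M, solution(3, -5)))
```

Whatever values are chosen for the two parameters, applying M returns V, because the homogeneous parts are sent to zero.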


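Example 30 Consider the matrix equation of example 29. It has solution set 

\[
S = \left\{ \begin{pmatrix} 1\\ 1\\ 0\\ 0 \end{pmatrix}
+ \mu_1 \begin{pmatrix} -1\\ 1\\ 1\\ 0 \end{pmatrix}
+ \mu_2 \begin{pmatrix} 1\\ -1\\ 0\\ 1 \end{pmatrix}
\;\middle|\; \mu_1, \mu_2 \in \mathbb{R} \right\}.
\]

Then $MX_0 = V$ says that $\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix} = \begin{pmatrix} 1\\ 1\\ 0\\ 0 \end{pmatrix}$ solves the original matrix equation, which 
is certainly true, but this is not the only solution. 

$MY_1 = 0$ says that $\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix} = \begin{pmatrix} -1\\ 1\\ 1\\ 0 \end{pmatrix}$ solves the homogeneous equation. 

$MY_2 = 0$ says that $\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix} = \begin{pmatrix} 1\\ -1\\ 0\\ 1 \end{pmatrix}$ solves the homogeneous equation. 

Notice how adding any multiple of a homogeneous solution to the particular solution 
yields another particular solution. 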
Reading homework: problem 2.5 


2.6 Review Problems 


Webwork: 
    Reading problems         4, 5 
    Solution sets            20, 21, 22 
    Geometry of solutions    23, 24, 25, 26 


1. Write down examples of augmented matrices corresponding to each 
of the five types of solution sets for systems of equations with three 
unknowns. 


2. Invent a simple linear system. Use the standard approach for solving 
linear systems and a non-standard approach to obtain different descriptions 
of the solution set which have different particular solutions. 


3. Let $f(X) = MX$ where 

\[
M = \begin{pmatrix} 1 & 0 & 1 & -1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
\quad\text{and}\quad
Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}.
\]

Suppose that $\alpha$ is any number. Compute the following four quantities: 
$\alpha X$, $f(X)$, $\alpha f(X)$ and $f(\alpha X)$. 
Check your work by verifying that 
\[ \alpha f(X) = f(\alpha X), \quad\text{and}\quad f(X + Y) = f(X) + f(Y). \]
Now explain why your results for $f(\alpha X)$ and $f(X + Y)$ together imply 
\[ f(\alpha X + \beta Y) = \alpha f(X) + \beta f(Y). \]

(Be sure to state which values of the scalars $\alpha$ and $\beta$ are allowed.) 


4. Let 

\[
M = \begin{pmatrix}
a^1_1 & a^1_2 & \cdots & a^1_k \\
a^2_1 & a^2_2 & \cdots & a^2_k \\
\vdots & \vdots & & \vdots \\
a^r_1 & a^r_2 & \cdots & a^r_k
\end{pmatrix}
\quad\text{and}\quad
X = \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^k \end{pmatrix}.
\]

Note: $x^2$ does not denote the square of $x$. Instead $x^1$, $x^2$, $x^3$, etc., 
denote different variables; the superscript is an index. Although confusing 
at first, this notation was invented by Albert Einstein who noticed 
that quantities like $a^2_1 x^1 + a^2_2 x^2 + \cdots + a^2_k x^k =: \sum_{j=1}^{k} a^2_j x^j$ can be written 
unambiguously as $a^2_j x^j$. This is called Einstein summation notation. 
The most important thing to remember is that the index $j$ is a dummy 
variable, so that $a^2_j x^j \equiv a^2_i x^i$; this is called “relabeling dummy indices”. 
When dealing with products of sums, you must remember to introduce 
a new dummy for each term; i.e., $a_i x^i b_i y^i = \sum_i a_i x^i b_i y^i$ does not equal 
$a_i x^i b_j y^j = \left(\sum_i a_i x^i\right)\left(\sum_j b_j y^j\right)$. 

Use Einstein summation notation to propose a rule for $MX$ so that 
$MX = 0$ is equivalent to the linear system 

\[
\begin{array}{rcl}
a^1_1 x^1 + a^1_2 x^2 + \cdots + a^1_k x^k &=& 0 \\
a^2_1 x^1 + a^2_2 x^2 + \cdots + a^2_k x^k &=& 0 \\
&\vdots& \\
a^r_1 x^1 + a^r_2 x^2 + \cdots + a^r_k x^k &=& 0
\end{array}
\]

Show that your rule for multiplying a matrix by a vector obeys the 
linearity property. 


5. The standard basis vector $e_i$ is a column vector with a one in the $i$th 
row, and zeroes everywhere else. Using the rule for multiplying a matrix 
times a vector in problem 4, find a simple rule for multiplying $Me_i$, 
where $M$ is the general matrix defined there. 


6. If $A$ is a non-linear operator, can the solutions to $Ax = b$ still be written 
as “general solution = particular solution + homogeneous solutions”? 
Provide examples. 


7. Find a system of equations whose solution set is the walls of a $1 \times 1 \times 1$ 
cube. (Hint: You may need to restrict the ranges of the variables; could 
your equations be linear?) 
The Simplex Method 


In Chapter 2, you learned how to handle systems of linear equations. However 
there are many situations in which inequalities appear instead of equalities. 
In such cases we are often interested in an optimal solution extremizing a 
particular quantity of interest. Questions like this are a focus of fields such as 
mathematical optimization and operations research. For the case where the 
functions involved are linear, these problems go under the title linear programming. 
Originally these ideas were driven by military applications, but 
by now they are ubiquitous in science and industry. Gigantic computers are dedicated 
to implementing linear programming methods such as George Dantzig’s 
simplex algorithm, the topic of this chapter. 


3.1 Pablo’s Problem 


Let us begin with an example. Consider again Pablo the nutritionist of 
problem 5, chapter 1. The Conundrum City school board has employed 
Pablo to design their school lunch program. Unfortunately for Pablo, their 
requirements are rather tricky: 


Example 31 (Pablo's problem) 

The Conundrum City school board is heavily influenced by the local fruit grower’s 
association. They have stipulated that children eat at least 7 oranges and 5 apples 
per week. Parents and teachers have agreed that eating at least 15 pieces of fruit per 
week is a good thing, but school janitors argue that too much fruit makes a terrible 
mess, so that children should eat no more than 25 pieces of fruit per week. 


Finally Pablo knows that oranges have twice as much sugar as apples and that apples 
have 5 grams of sugar each. Too much sugar is unhealthy, so Pablo wants to keep the 
children’s sugar intake as low as possible. How many oranges and apples should Pablo 
suggest that the school board put on the menu? 


This is a rather gnarly word problem. Our first step is to restate it as 
mathematics, stripping away all the extraneous information: 


Example 32 (Pablo's problem restated) 
Let $x$ be the number of apples and $y$ be the number of oranges. These must obey 

\[ x \geq 5 \quad\text{and}\quad y \geq 7, \]

to fulfill the school board’s politically motivated wishes. The teachers’ and parents’ 
fruit requirement means that 

\[ x + y \geq 15, \]

but to keep the canteen tidy 

\[ x + y \leq 25. \]

Now let 

\[ s = 5x + 10y. \]


This linear function of (x, y) represents the grams of sugar in x apples and y oranges. 
The problem is asking us to minimize s subject to the four linear inequalities listed 
above. 
3.2 Graphical Solutions 


Before giving a more general algorithm for handling this problem and prob- 
lems like it, we note that when the number of variables is small (preferably 2), 
a graphical technique can be used. 

Inequalities, such as the four given in Pablo’s problem, are often called 
constraints, and values of the variables that satisfy these constraints comprise 
the so-called feasible region. Since there are only two variables, this is easy 
to plot: 


Example 33 (Constraints and feasible region) Pablo's constraints are 


\[
\begin{array}{c}
x \geq 5 \\
y \geq 7 \\
15 \leq x + y \leq 25.
\end{array}
\]


Plotted in the (x,y) plane, this gives: 


You might be able to see the solution to Pablo’s problem already. Oranges 
are very sugary, so they should be kept low, thus y = 7. Also, the less fruit 
the better, so the answer had better lie on the line x + y = 15. Hence, 
the answer must be at the vertex (8,7). Actually this is a general feature 
of linear programming problems: the optimal answer must lie at a vertex of 
the feasible region. Rather than prove this, let's look at a plot of the linear 
function $s(x, y) = 5x + 10y$. 


Example 34 (The sugar function) 
Plotting the sugar function requires three dimensions: 


[Figure: the sugar function $s(x, y)$ plotted over the feasible region.] 


The plot of a linear function of two variables is a plane through the origin. 
Restricting the variables to the feasible region gives some lamina in 3-space. 
Since the function we want to optimize is linear (and presumably non-zero), if 
we pick a point in the middle of this lamina, we can always increase or decrease 
the function by moving out to an edge and, in turn, along that edge to a 
corner. Applying this to the above picture, we see that Pablo’s best option 
is 110 grams of sugar a week, in the form of 8 apples and 7 oranges. 
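
The "check the corners" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each of Pablo's constraints is rewritten in the form a*x + b*y <= c, boundary lines are intersected pairwise, and the helper names (`sugar`, `feasible`, `vertices`) are hypothetical, not from the text.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is stored as (a, b, c), meaning a*x + b*y <= c.
constraints = [
    (-1, 0, -5),    # x >= 5
    (0, -1, -7),    # y >= 7
    (-1, -1, -15),  # x + y >= 15
    (1, 1, 25),     # x + y <= 25
]


def sugar(x, y):
    """Grams of sugar in x apples and y oranges."""
    return 5 * x + 10 * y


def feasible(x, y):
    """True when (x, y) satisfies every constraint."""
    return all(a * x + b * y <= c for a, b, c in constraints)


# Candidate corners: intersections of pairs of boundary lines (Cramer's rule).
vertices = set()
for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
    det = a1 * b2 - a2 * b1
    if det == 0:
        continue  # parallel boundary lines never meet
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    if feasible(x, y):
        vertices.add((x, y))

best = min(vertices, key=lambda p: sugar(*p))
print(best, sugar(*best))
```

For Pablo's four constraints this finds four feasible corners and picks out (8, 7) with 110 grams of sugar, in agreement with the graphical argument. Checking every pair of constraints becomes impractical as the number of constraints grows, which is the motivation for the algorithm of the next section.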


It is worthwhile to contrast the optimization problem for a linear function 
with the non-linear case you may have seen in calculus courses: 


Here we have plotted the curve f(x) = d in the case where the function f is 
linear and non-linear. To optimize f in the interval [a,b], for the linear case 
we just need to compute and compare the values f(a) and f(b). In contrast, 
for non-linear functions it is necessary to also compute the derivative df /dx 
to study whether there are extrema inside the interval. 


3.3 Dantzig’s Algorithm 


In simple situations a graphical method might suffice, but in many applica- 
tions there may be thousands or even millions of variables and constraints. 
Clearly an algorithm that can be implemented on a computer is needed. The 
simplex algorithm (usually attributed to George Dantzig) provides exactly 
that. It begins with a standard problem: 


Problem 35 Maximize $f(x_1, \ldots, x_n)$ where $f$ is linear, $x_i \geq 0$ ($i = 1, \ldots, n$) subject to 

\[ Mx = v, \qquad x := \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \]

where the $m \times n$ matrix $M$ and $m \times 1$ column vector $v$ are given. 


This is solved by arranging the information in an augmented matrix and 
then applying EROs. To see how this works let's try an example. 

Example 36 Maximize $f(x, y, z, w) = 3x - 3y - z + 4w$ subject to constraints 

\[
\begin{array}{rcl}
c_1 := x + y + z + w &=& 5 \\
c_2 := x + 2y + 3z + 2w &=& 6,
\end{array}
\]

where $x \geq 0$, $y \geq 0$, $z \geq 0$ and $w \geq 0$. 


The key observation is this: Suppose we are trying to maximize $f(x_1, \ldots, x_n)$ 
subject to a constraint $c(x_1, \ldots, x_n) = k$ for some constant $k$ ($c$ and $k$ would 
be the entries of $Mx$ and $v$, respectively, in the above). Then we can also 
try to maximize 

\[ f(x_1, \ldots, x_n) + \alpha\, c(x_1, \ldots, x_n) \]

because this is only a constant shift $f \to f + \alpha k$. Choosing $\alpha$ carefully can 
lead to a simple form for the function we are extremizing. 


Example 37 (Setting up an augmented matrix): 
Since we are interested in the optimum value of $f$, we treat it as an additional 
variable and add one further equation 

\[ -3x + 3y + z - 4w + f = 0. \]

We arrange this equation and the two constraints in an augmented matrix 

\[
\left(\begin{array}{rrrrr|r}
1 & 1 & 1 & 1 & 0 & 5 \\
1 & 2 & 3 & 2 & 0 & 6 \\
-3 & 3 & 1 & -4 & 1 & 0
\end{array}\right)
\Leftrightarrow
\begin{array}{rcl}
c_1 &=& 5 \\
c_2 &=& 6 \\
f &=& 3x - 3y - z + 4w.
\end{array}
\]

Keep in mind that the first four columns correspond to the positive variables $(x, y, z, w)$ 
and that the last row has the information of the function $f$. The general case is depicted 
in figure 3.1. 


Now the system is written as an augmented matrix where the last row 
encodes the objective function and the other rows the constraints. Clearly we 
can perform row operations on the constraint rows since this will not change 
the solutions to the constraints. Moreover, we can add any amount of the 
constraint rows to the last row, since this just amounts to adding a constant 
to the function we want to extremize. 
Figure 3.1: Arranging the information of an optimization problem in an 
augmented matrix. The columns hold the variables (incl. slack and artificial) 
followed by the objective; the upper rows are the constraint equations and the 
last row is the objective equation, whose last entry is the objective value. 


Example 38 (Performing EROs) 

We scan the last row, and notice the (most negative) coefficient $-4$. Naively you 
might think that this is good because this multiplies the positive variable $w$ and only 
helps the objective function $f = 4w + \cdots$. However, what this actually means is that 
the variable $w$ will be large but determined by the constraints. Therefore we want to 
remove it from the objective function. We can zero out this entry by performing a 
row operation. For that, either of the first two rows could be used. To decide which, we 
remember that we still have to solve the constraints for variables that are 
positive. Hence we should try to keep the first two entries in the last column positive. 
Hence we choose the row which will add the smallest constant to $f$ when we zero out 
the $-4$: Look at the last column (where the values of the constraints are stored). We 
see that adding four times the first row to the last row would zero out the $-4$ entry 
but add 20 to $f$, while adding two times the second row to the last row would also 
zero out the $-4$ but only add 12 to $f$. (You can follow this by watching what happens 
to the last entry in the last row.) So we perform the latter row operation and obtain 
the following: 

\[
\left(\begin{array}{rrrrr|r}
1 & 1 & 1 & 1 & 0 & 5 \\
1 & 2 & 3 & 2 & 0 & 6 \\
-1 & 7 & 7 & 0 & 1 & 12
\end{array}\right)
\qquad
\begin{array}{rcl}
c_1 &=& 5 \\
c_2 &=& 6 \\
f + 2c_2 &=& 12 + x - 7y - 7z.
\end{array}
\]


We do not want to undo any of our good work when we perform further row operations, 
so now we use the second row to zero out all other entries in the fourth column. This 
is achieved by subtracting half the second row from the first: 


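\[
\left(\begin{array}{rrrrr|r}
\frac{1}{2} & 0 & -\frac{1}{2} & 0 & 0 & 2 \\
1 & 2 & 3 & 2 & 0 & 6 \\
-1 & 7 & 7 & 0 & 1 & 12
\end{array}\right)
\qquad
\begin{array}{rcl}
c_1 - \frac{1}{2}c_2 &=& 2 \\
c_2 &=& 6 \\
f + 2c_2 &=& 12 + x - 7y - 7z.
\end{array}
\]

Precisely because we chose the second row to perform our row operations, all entries 
in the last column remain positive. This allows us to continue the algorithm. 

We now repeat the above procedure: There is a $-1$ in the first column of the last 
row. We want to zero it out while adding as little to $f$ as possible. This is achieved 
by adding twice the first row to the last row: 

\[
\left(\begin{array}{rrrrr|r}
\frac{1}{2} & 0 & -\frac{1}{2} & 0 & 0 & 2 \\
1 & 2 & 3 & 2 & 0 & 6 \\
0 & 7 & 6 & 0 & 1 & 16
\end{array}\right)
\qquad
\begin{array}{rcl}
c_1 - \frac{1}{2}c_2 &=& 2 \\
c_2 &=& 6 \\
f + 2c_2 + 2(c_1 - \frac{1}{2}c_2) &=& 16 - 7y - 6z.
\end{array}
\]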


The Dantzig algorithm terminates if all the coefficients in the last row (save perhaps 
for the last entry which encodes the value of the objective) are positive. To see why 
we are done, let's write out what our row operations have done in terms of the function 
$f$ and the constraints $(c_1, c_2)$. First we have 

\[ f + 2c_2 + 2\left(c_1 - \tfrac{1}{2}c_2\right) = 16 - 7y - 6z \]

with both $y$ and $z$ positive. Hence to maximize $f$ we should choose $y = 0 = z$. In 
which case we obtain our optimum value 

\[ f = 16. \]

Finally, we check that the constraints can be solved with $y = 0 = z$ and positive 
$(x, w)$. Indeed, they can by taking $x = 4$ and $w = 1$. 
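
The three row operations of example 38 can be replayed mechanically. The sketch below is not a general simplex implementation; it only repeats the specific operations chosen by hand above, using exact fractions. The helper `add_multiple` and the 0-based row indices are illustrative assumptions.

```python
from fractions import Fraction

# Columns: x, y, z, w, f | constant.
# The last row encodes -3x + 3y + z - 4w + f = 0.
T = [[Fraction(v) for v in row] for row in [
    [1, 1, 1, 1, 0, 5],
    [1, 2, 3, 2, 0, 6],
    [-3, 3, 1, -4, 1, 0],
]]


def add_multiple(tableau, target, source, factor):
    """Row operation R_target <- R_target + factor * R_source (in place)."""
    tableau[target] = [t + factor * s
                       for t, s in zip(tableau[target], tableau[source])]


add_multiple(T, 2, 1, 2)                # zero out the -4 with the second row
add_multiple(T, 0, 1, Fraction(-1, 2))  # clear the w column in the first row
add_multiple(T, 2, 0, 2)                # zero out the -1 with the first row

print(T[2])  # last row: coefficients 0, 7, 6, 0, 1 and the value 16

# With y = z = 0 the constraints x + w = 5 and x + 2w = 6 give x = 4, w = 1.
x, y, z, w = 4, 0, 0, 1
assert x + y + z + w == 5
assert x + 2 * y + 3 * z + 2 * w == 6
assert 3 * x - 3 * y - z + 4 * w == 16
```

The final assertions confirm that the point (x, y, z, w) = (4, 0, 0, 1) satisfies both constraints and attains f = 16.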


3.4 Pablo Meets Dantzig 


Oftentimes, it takes a few tricks to bring a given problem into the standard 
form of example 36. In Pablo’s case, this goes as follows. 


Example 39 Pablo’s variables $x$ and $y$ do not obey $x_i \geq 0$. Therefore define new 
variables 
\[ x_1 = x - 5, \qquad x_2 = y - 7. \]

The conditions on the fruit $15 \leq x + y \leq 25$ are inequalities, 
\[ x_1 + x_2 \geq 3, \qquad x_1 + x_2 \leq 13, \]

so are not of the form $Mx = v$. To achieve this we introduce two new positive 
variables $x_3 \geq 0$, $x_4 \geq 0$ and write 

\[ c_1 := x_1 + x_2 - x_3 = 3, \qquad c_2 := x_1 + x_2 + x_4 = 13. \]

These are called slack variables because they take up the “slack” required to convert 
inequality to equality. This pair of equations can now be written as $Mx = v$, 

\[
\begin{pmatrix} 1 & 1 & -1 & 0 \\ 1 & 1 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
= \begin{pmatrix} 3 \\ 13 \end{pmatrix}.
\]

Finally, Pablo wants to minimize sugar $s = 5x + 10y$, but the standard problem 
maximizes $f$. Thus the so-called objective function is $f = -s + 95 = -5x_1 - 10x_2$. 
(Notice that it makes no difference whether we maximize $-s$ or $-s + 95$, we choose 
the latter since it is a linear function of $(x_1, x_2)$.) Now we can build an augmented 
matrix whose last row reflects the objective function equation $5x_1 + 10x_2 + f = 0$: 

\[
\left(\begin{array}{rrrrr|r}
1 & 1 & -1 & 0 & 0 & 3 \\
1 & 1 & 0 & 1 & 0 & 13 \\
5 & 10 & 0 & 0 & 1 & 0
\end{array}\right)
\]

Here it seems that the simplex algorithm already terminates because the last row only 
has positive coefficients, so that setting $x_1 = 0 = x_2$ would be optimal. However, this 
does not solve the constraints (for positive values of the slack variables $x_3$ and $x_4$). 
Thus one more (very dirty) trick is needed. We add two more, positive, (so-called) 
artificial variables $x_5$ and $x_6$ to the problem which we use to shift each constraint 

\[ c_1 \to c_1 - x_5, \qquad c_2 \to c_2 - x_6. \]

The idea being that for large positive $\alpha$, the modified objective function 

\[ f - \alpha x_5 - \alpha x_6 \]

is only maximal when the artificial variables vanish so the underlying problem is unchanged. 
Let's take $\alpha = 10$ (our solution will not depend on this choice) so that our 
augmented matrix reads 

\[
\left(\begin{array}{rrrrrrr|r}
1 & 1 & -1 & 0 & 1 & 0 & 0 & 3 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 13 \\
5 & 10 & 0 & 0 & 10 & 10 & 1 & 0
\end{array}\right)
\stackrel{R_3' = R_3 - 10R_1 - 10R_2}{\sim}
\left(\begin{array}{rrrrrrr|r}
1 & 1 & -1 & 0 & 1 & 0 & 0 & 3 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 13 \\
-15 & -10 & 10 & -10 & 0 & 0 & 1 & -160
\end{array}\right)
\]

Here we performed one row operation to zero out the coefficients of the artificial 
variables. Now we are ready to run the simplex algorithm exactly as in section 3.3. 
The first row operation uses the 1 in the top of the first column to zero out the most 
negative entry in the last row: 

\[
\left(\begin{array}{rrrrrrr|r}
1 & 1 & -1 & 0 & 1 & 0 & 0 & 3 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 13 \\
0 & 5 & -5 & -10 & 15 & 0 & 1 & -115
\end{array}\right)
\stackrel{R_2' = R_2 - R_1}{\sim}
\left(\begin{array}{rrrrrrr|r}
1 & 1 & -1 & 0 & 1 & 0 & 0 & 3 \\
0 & 0 & 1 & 1 & -1 & 1 & 0 & 10 \\
0 & 5 & -5 & -10 & 15 & 0 & 1 & -115
\end{array}\right)
\stackrel{R_3' = R_3 + 10R_2}{\sim}
\left(\begin{array}{rrrrrrr|r}
1 & 1 & -1 & 0 & 1 & 0 & 0 & 3 \\
0 & 0 & 1 & 1 & -1 & 1 & 0 & 10 \\
0 & 5 & 5 & 0 & 5 & 10 & 1 & -15
\end{array}\right)
\]

Now the variables $(x_2, x_3, x_5, x_6)$ have positive coefficients in the last row so must be set to zero to 
maximize $f$. The optimum value is $f = -15$ so $s = -f + 95 = 110$ exactly as before. 
Finally, solving the constraints gives $x_1 = 3$ and $x_4 = 10$, so that $x = 8$ and $y = 7$, which 
also agrees with our previous result. 


Clearly, performed by hand, the simplex algorithm was slow and complex 
for Pablo’s problem. However, the key point is that it is an algorithm that 
can be fed to a computer. For problems with many variables, this method is 
much faster than simply checking all vertices as we did in section 3.2. 


3.5 Review Problems 
1. Maximize $f(x, y) = 2x + 3y$ subject to the constraints 
\[ x \geq 0, \quad y \geq 0, \quad x + 2y \leq 2, \quad 2x + y \leq 2, \]
by 


(a) sketching the region in the xy-plane defined by the constraints 
and then checking the values of f at its corners; and, 


(b) the simplex algorithm (hint: introduce slack variables). 
To continue our linear algebra journey, we must discuss n-vectors with an 
arbitrarily large number of components. The simplest way to think about 
these is as ordered lists of numbers, 

\[ a = \begin{pmatrix} a^1 \\ \vdots \\ a^n \end{pmatrix}. \]

Do not be confused by our use of a superscript to label components of a vector. 
Here $a^2$ denotes the second component of the vector $a$, rather than the number 
$a$ squared! 

We emphasize that order matters: 


Example 40 (Order of Components Matters) 

\[
\begin{pmatrix} 7 \\ 4 \\ 2 \\ 5 \end{pmatrix} \neq \begin{pmatrix} 7 \\ 2 \\ 4 \\ 5 \end{pmatrix}.
\]


The set of all n-vectors is denoted $\mathbb{R}^n$. As an equation 

\[
\mathbb{R}^n := \left\{ \begin{pmatrix} a^1 \\ \vdots \\ a^n \end{pmatrix} \;\middle|\; a^1, \ldots, a^n \in \mathbb{R} \right\}.
\]

4.1 Addition and Scalar Multiplication in $\mathbb{R}^n$ 


A simple but important property of n-vectors is that we can add n-vectors 
and multiply n-vectors by a scalar: 


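Definition Given two n-vectors $a$ and $b$ whose components are given by 

\[
a = \begin{pmatrix} a^1 \\ \vdots \\ a^n \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} b^1 \\ \vdots \\ b^n \end{pmatrix}
\]

their sum is 

\[ a + b := \begin{pmatrix} a^1 + b^1 \\ \vdots \\ a^n + b^n \end{pmatrix}. \]

Given a scalar $\lambda$, the scalar multiple 

\[ \lambda a := \begin{pmatrix} \lambda a^1 \\ \vdots \\ \lambda a^n \end{pmatrix}. \]

Example 41 Let 

\[
a = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 4 \\ 3 \\ 2 \\ 1 \end{pmatrix}.
\]

Then, for example, 

\[
a + b = \begin{pmatrix} 5 \\ 5 \\ 5 \\ 5 \end{pmatrix}
\quad\text{and}\quad
3a - 2b = \begin{pmatrix} -5 \\ 0 \\ 5 \\ 10 \end{pmatrix}.
\]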
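
Componentwise addition and scalar multiplication translate directly into code. The following minimal sketch stores n-vectors as Python lists and takes a = (1, 2, 3, 4) and b = (4, 3, 2, 1), the vectors of example 41; the function names `add` and `scale` are illustrative choices.

```python
def add(a, b):
    """Componentwise sum of two n-vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return [ai + bi for ai, bi in zip(a, b)]


def scale(lam, a):
    """Scalar multiple lam * a."""
    return [lam * ai for ai in a]


a = [1, 2, 3, 4]
b = [4, 3, 2, 1]

total = add(a, b)                        # a + b
mixed = add(scale(3, a), scale(-2, b))   # 3a - 2b
print(total, mixed)
```

Here `total` comes out as (5, 5, 5, 5) and `mixed` as (-5, 0, 5, 10). The length check reflects the fact that the sum is only defined for vectors with the same number of components.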


A special vector is the zero vector. All of its components are zero: 


\[ 0 = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}. \]


In Euclidean geometry—the study of R” with lengths and angles defined 
as in section 4.3 —n-vectors are used to label points P and the zero vector 
labels the origin O. In this sense, the zero vector is the only one with zero 
magnitude, and the only one which points in no particular direction. 
4.2 Hyperplanes 


Vectors in R” can be hard to visualize. However, familiar objects like lines 
and planes still make sense: The line L along the direction defined by a 
vector v and through a point P labeled by a vector u can be written as 


\[ L = \{u + tv \mid t \in \mathbb{R}\}. \]

Sometimes, since we know that a point $P$ corresponds to a vector, we will 
be lazy and just write $L = \{P + tv \mid t \in \mathbb{R}\}$. 


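Example 42 

\[
\left\{ \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} + t \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \;\middle|\; t \in \mathbb{R} \right\}
\]

describes a line in $\mathbb{R}^4$ parallel to the $x_1$-axis. 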


Given two non-zero vectors $u, v$, they will usually determine a plane, 
unless both vectors are in the same line, in which case, one of the vectors 
is a scalar multiple of the other. The sum of $u$ and $v$ corresponds to laying 
the two vectors head-to-tail and drawing the connecting vector. If $u$ and $v$ 
determine a plane, then their sum lies in the plane determined by $u$ and $v$. 




Vectors in Space, n-Vectors 


The plane determined by two vectors u and v can be written as 


\[ \{P + su + tv \mid s, t \in \mathbb{R}\}. \]


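Example 43 

\[
\left\{ \begin{pmatrix} 3 \\ 1 \\ 4 \\ 1 \\ 5 \\ 9 \end{pmatrix}
+ s \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ t \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\;\middle|\; s, t \in \mathbb{R} \right\}
\]

describes a plane in 6-dimensional space parallel to the xy-plane. 


Parametric Notation 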


We can generalize the notion of a plane: 


Definition A set of $k$ vectors $v_1, \ldots, v_k$ in $\mathbb{R}^n$ with $k \leq n$ determines a 
$k$-dimensional hyperplane, unless any of the vectors $v_i$ lives in the same hyperplane 
determined by the other vectors. If the vectors do determine a 
$k$-dimensional hyperplane, then any point in the hyperplane can be written 
as: 

\[ \left\{ P + \sum_{i=1}^{k} \lambda_i v_i \;\middle|\; \lambda_i \in \mathbb{R} \right\} \]

When the dimension $k$ is not specified, one usually assumes that $k = n - 1$ 
for a hyperplane inside $\mathbb{R}^n$. 

4.3 Directions and Magnitudes 


Consider the Euclidean length of a vector: 


\[ \|v\| := \sqrt{(v^1)^2 + (v^2)^2 + \cdots + (v^n)^2}. \]


Using the Law of Cosines, we can then figure out the angle between two 
vectors. Given two vectors v and u that span a plane in R”, we can then 
connect the ends of v and u with the vector v — u. 


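Then the Law of Cosines states that: 

\[ \|v - u\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\,\|v\| \cos\theta. \]

Then isolate $\cos\theta$: 

\[
\begin{array}{rcl}
\|v - u\|^2 &=& (v^1 - u^1)^2 + \cdots + (v^n - u^n)^2, \\
\|u\|^2 &=& (u^1)^2 + \cdots + (u^n)^2, \\
\|v\|^2 &=& (v^1)^2 + \cdots + (v^n)^2, \\
\|v - u\|^2 - \|u\|^2 - \|v\|^2 &=& -2u^1v^1 - \cdots - 2u^nv^n.
\end{array}
\]

Thus, 
\[ \|u\|\,\|v\| \cos\theta = u^1v^1 + \cdots + u^nv^n. \]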


Note that in the above discussion, we have assumed (correctly) that Eu- 
clidean lengths in R” give the usual notion of lengths of vectors for any plane 
in R”. This now motivates the definition of the dot product. 
Definition The dot product of two vectors $u = \begin{pmatrix} u^1 \\ \vdots \\ u^n \end{pmatrix}$ and $v = \begin{pmatrix} v^1 \\ \vdots \\ v^n \end{pmatrix}$ is 

\[ u \bullet v := u^1v^1 + \cdots + u^nv^n. \]

The length or norm or magnitude of a vector 
\[ \|v\| := \sqrt{v \bullet v}. \]
The angle $\theta$ between two vectors is determined by the formula 
\[ u \bullet v = \|u\|\,\|v\| \cos\theta. \]


Remark When the dot product between two vectors vanishes, we say that they are 
perpendicular or orthogonal. Notice that the zero vector is orthogonal to every vector. 
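
As an illustration of these definitions, here is a short Python sketch that computes dot products, norms and angles for vectors stored as lists. The function names are illustrative, and the clamping of the cosine to the interval [-1, 1] is an added safeguard against floating-point rounding, not part of the definition.

```python
import math


def dot(u, v):
    """Dot product u^1 v^1 + ... + u^n v^n."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(v):
    """Euclidean length, the square root of v dotted with itself."""
    return math.sqrt(dot(v, v))


def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    cosine = dot(u, v) / (norm(u) * norm(v))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.acos(cosine)


a = [1, 2, 3, 4]
b = [4, 3, 2, 1]
print(dot(a, b), norm(a), angle(a, b))

# Orthogonal vectors have vanishing dot product and meet at a right angle.
e1 = [1, 0]
e2 = [0, 5]
assert dot(e1, e2) == 0
assert math.isclose(angle(e1, e2), math.pi / 2)
```

For the sample vectors a and b the cosine of the angle is 20/30 = 2/3.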


The dot product has some important properties: 
1. The dot product is symmetric, so 

\[ u \bullet v = v \bullet u, \]

2. Distributive so 
\[ u \bullet (v + w) = u \bullet v + u \bullet w, \]

3. Bilinear, which is to say, linear in both $u$ and $v$. Thus 
\[ u \bullet (cv + dw) = c\, u \bullet v + d\, u \bullet w, \]
and 
\[ (cu + dw) \bullet v = c\, u \bullet v + d\, w \bullet v. \]

4. Positive Definite: 
\[ u \bullet u \geq 0, \]
and $u \bullet u = 0$ only when $u$ itself is the 0-vector. 

There are, in fact, many different useful ways to define lengths of vectors. 
Notice in the definition above that we first defined the dot product, and then 
defined everything else in terms of the dot product. So if we change our idea 
of the dot product, we change our notion of length and angle as well. The 
dot product determines the Euclidean length and angle between two vectors. 

Other definitions of length and angle arise from inner products, which 
have all of the properties listed above (except that in some contexts the 
positive definite requirement is relaxed). Instead of writing $\bullet$ for other inner 
products, we usually write $\langle u, v \rangle$ to avoid confusion. 


Reading homework: problem 1 


Example 44 Consider a four-dimensional space, with a special direction which we will 
call “time”. The Lorentzian inner product on $\mathbb{R}^4$ is given by $\langle u, v \rangle = u^1v^1 + u^2v^2 + 
u^3v^3 - u^4v^4$. This is of central importance in Einstein's theory of special relativity. 
Note, in particular, that it is not positive definite. 

As a result, the “squared-length” of a vector with coordinates $x$, $y$, $z$ and $t$ is 
$\|v\|^2 = x^2 + y^2 + z^2 - t^2$. Notice that it is possible for $\|v\|^2 \leq 0$ even with 
non-vanishing $v$! 
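
A small Python sketch makes the failure of positive definiteness concrete. It assumes vectors are stored as lists (x, y, z, t) with the time component last; the function name `lorentz` and the sample vectors are illustrative choices.

```python
def lorentz(u, v):
    """Lorentzian inner product on R^4: the fourth (time) slot enters with a minus sign."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2] - u[3] * v[3]


space_direction = [1, 0, 0, 0]
time_direction = [0, 0, 0, 1]
sample = [1, 2, 2, 4]

print(lorentz(space_direction, space_direction))  # squared length +1
print(lorentz(time_direction, time_direction))    # squared length -1
print(lorentz(sample, sample))                    # 1 + 4 + 4 - 16 = -7

# Symmetry still holds even though positive definiteness fails.
assert lorentz(sample, time_direction) == lorentz(time_direction, sample)
```

The second and third vectors are non-zero yet have negative squared length, which cannot happen for the dot product.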


Theorem 4.3.1 (Cauchy-Schwarz Inequality). For non-zero vectors $u$ and $v$ 
with an inner-product $\langle\ ,\ \rangle$, 

\[ \frac{|\langle u, v \rangle|}{\|u\|\,\|v\|} \leq 1. \]

Proof. The easiest proof would use the definition of the angle between two 
vectors and the fact that $|\cos\theta| \leq 1$. However, strictly speaking we 
did not check our assumption that we could apply the Law of Cosines to the 
Euclidean length in $\mathbb{R}^n$. There is, however, a simple algebraic proof. Let $\alpha$ be 
any real number and consider the following positive, quadratic polynomial 
in $\alpha$ 

\[ 0 \leq \langle u + \alpha v, u + \alpha v \rangle = \langle u, u \rangle + 2\alpha \langle u, v \rangle + \alpha^2 \langle v, v \rangle. \]

You should carefully check for yourself exactly which properties of an inner 
product were used to write down the above inequality! 

Next, a tiny calculus computation shows that any quadratic $a\alpha^2 + 2b\alpha + c$ with $a > 0$ 
takes its minimal value $c - \frac{b^2}{a}$ when $\alpha = -\frac{b}{a}$. Applying this to the above 
quadratic gives 

\[ 0 \leq \langle u, u \rangle - \frac{\langle u, v \rangle^2}{\langle v, v \rangle}. \]

Now it is easy to rearrange this inequality to reach the Cauchy-Schwarz one 
above. 


Theorem 4.3.2 (Triangle Inequality). Given vectors $u$ and $v$, we have: 
\[ \|u + v\| \leq \|u\| + \|v\|. \]
Proof. 

\[
\begin{array}{rcl}
\|u + v\|^2 &=& (u + v) \bullet (u + v) \\
&=& u \bullet u + 2\, u \bullet v + v \bullet v \\
&=& \|u\|^2 + \|v\|^2 + 2\|u\|\,\|v\| \cos\theta \\
&=& (\|u\| + \|v\|)^2 + 2\|u\|\,\|v\| (\cos\theta - 1) \\
&\leq& (\|u\| + \|v\|)^2.
\end{array}
\]

Then the square of the left-hand side of the triangle inequality is $\leq$ the 
square of the right-hand side, and both sides are positive, so the result is true. 


The triangle inequality is also “self-evident” examining a sketch of u, v 
and u +v: 


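[Figure: a triangle with sides $u$, $v$ and $u + v$, of lengths $\|u\|$, $\|v\|$ and $\|u + v\|$.] 

Example 45 Let 

\[
a = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 4 \\ 3 \\ 2 \\ 1 \end{pmatrix},
\]

so that 
\[ a \bullet a = b \bullet b = 1 + 2^2 + 3^2 + 4^2 = 30 \]
\[ \Rightarrow \|a\| = \sqrt{30} = \|b\| \quad\text{and}\quad (\|a\| + \|b\|)^2 = (2\sqrt{30})^2 = 120. \]

Since 

\[ a + b = \begin{pmatrix} 5 \\ 5 \\ 5 \\ 5 \end{pmatrix}, \]

we have 
\[ \|a + b\|^2 = 5^2 + 5^2 + 5^2 + 5^2 = 100 < 120 = (\|a\| + \|b\|)^2 \]

as predicted by the triangle inequality. 
Notice also that $a \bullet b = 1 \cdot 4 + 2 \cdot 3 + 3 \cdot 2 + 4 \cdot 1 = 20 < \sqrt{30} \cdot \sqrt{30} = 30 = \|a\|\,\|b\|$ in 
accordance with the Cauchy-Schwarz inequality. 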
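
Both inequalities can be spot-checked numerically for the vectors a = (1, 2, 3, 4) and b = (4, 3, 2, 1). The sketch below is a check of one instance, not a proof; the helper names are illustrative.

```python
import math


def dot(u, v):
    """Dot product of two vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


a = [1, 2, 3, 4]
b = [4, 3, 2, 1]
a_plus_b = [ai + bi for ai, bi in zip(a, b)]

# Cauchy-Schwarz: |a . b| <= ||a|| ||b||
cauchy_schwarz_holds = abs(dot(a, b)) <= norm(a) * norm(b)

# Triangle inequality: ||a + b|| <= ||a|| + ||b||
triangle_holds = norm(a_plus_b) <= norm(a) + norm(b)

print(dot(a, a), dot(a, b), dot(a_plus_b, a_plus_b))
print(cauchy_schwarz_holds, triangle_holds)
```

The squared quantities 30, 20 and 100 are computed exactly with integers; only the final comparisons involve floating-point square roots.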


Reading homework: problem 2 


4.4 Vectors, Lists and Functions 


Suppose you are going shopping. You might jot down something like this on 
a piece of paper: 


We could represent this information mathematically as a set, 
S = {apple, orange, onion, milk, carrot} . 
There is no information of ordering here and no information about how many 
carrots you will buy. This set by itself is not a vector; how would we add 
such sets to one another? 

If you were a more careful shopper your list might look like this: 


What you have really done here is assign a number to each element of the 
set S. In other words, the second list is a function 


\[ f : S \to \mathbb{R}. \]


Given two lists like the second one above, we could easily add them — if you 
plan to buy 5 apples and I am buying 3 apples, together we will buy 8 apples! 
In fact, the second list is really a 5-vector in disguise. 

In general it is helpful to think of an n-vector as a function whose domain 
is the set {1,...,n}. This is equivalent to thinking of an n-vector as an 
ordered list of n numbers. These two ideas give us two equivalent notions for 
the set of all n-vectors: 


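\[
\mathbb{R}^n := \left\{ \begin{pmatrix} a^1 \\ \vdots \\ a^n \end{pmatrix} \;\middle|\; a^1, \ldots, a^n \in \mathbb{R} \right\}
= \{a : \{1, \ldots, n\} \to \mathbb{R}\} =: \mathbb{R}^{\{1, \ldots, n\}}.
\]

The notation $\mathbb{R}^{\{1, \ldots, n\}}$ is used to denote functions from $\{1, \ldots, n\}$ to $\mathbb{R}$. Similarly, 
for any set $S$ the notation $\mathbb{R}^S$ denotes the set of functions from $S$ 
to $\mathbb{R}$: 

\[ \mathbb{R}^S := \{f : S \to \mathbb{R}\}. \]

When $S$ is an ordered set like $\{1, \ldots, n\}$, it is natural to write the components 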
in order. When the elements of S do not have a natural ordering, doing so 
might cause confusion. 


Example 46 Consider the set $S = \{\ast, \star, \#\}$ from chapter 1 review problem 9. A 
particular element of $\mathbb{R}^S$ is the function $a$ explicitly defined by 

\[ a^\star = 3, \quad a^\# = 5, \quad a^\ast = -2. \]

It is not natural to write 

\[
a = \begin{pmatrix} 3 \\ 5 \\ -2 \end{pmatrix}
\quad\text{or}\quad
a = \begin{pmatrix} -2 \\ 3 \\ 5 \end{pmatrix}
\]

because the elements of $S$ do not have an ordering, since as sets $\{\ast, \star, \#\} = \{\star, \#, \ast\}$. 


In this important way, $\mathbb{R}^S$ seems different from $\mathbb{R}^3$. What is more evident 
are the similarities; since we can add two functions, we can add two elements 
of $\mathbb{R}^S$: 


Example 47 Addition in $\mathbb{R}^S$ 
If $a^\star = 3$, $a^\# = 5$, $a^\ast = -2$ and $b^\star = -2$, $b^\# = 4$, $b^\ast = 13$ then $a + b$ is the function 

\[ (a + b)^\star = 3 - 2 = 1, \quad (a + b)^\# = 5 + 4 = 9, \quad (a + b)^\ast = -2 + 13 = 11. \]


Also, since we can multiply functions by numbers, there is a notion of scalar 
multiplication on $\mathbb{R}^S$: 


Example 48 Scalar Multiplication in $\mathbb{R}^S$ 
If $a^\star = 3$, $a^\# = 5$, $a^\ast = -2$, then $3a$ is the function 

\[ (3a)^\star = 3 \cdot 3 = 9, \quad (3a)^\# = 3 \cdot 5 = 15, \quad (3a)^\ast = 3(-2) = -6. \]
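
A function from a finite set S to the real numbers is naturally modeled by a dictionary, which has no preferred ordering of its keys when compared. The sketch below redoes examples 47 and 48 this way; the key strings "star", "hash" and "asterisk" are hypothetical labels standing in for the three elements of S, and the function names are illustrative.

```python
# Elements of R^S as dictionaries: each element of S is mapped to a number.
a = {"star": 3, "hash": 5, "asterisk": -2}
b = {"star": -2, "hash": 4, "asterisk": 13}


def add(f, g):
    """Pointwise sum of two functions defined on the same set S."""
    if f.keys() != g.keys():
        raise ValueError("both functions must be defined on the same set")
    return {s: f[s] + g[s] for s in f}


def scale(c, f):
    """Pointwise scalar multiple c * f."""
    return {s: c * f[s] for s in f}


print(add(a, b))
print(scale(3, a))

# Listing the elements of S in another order gives the same function.
assert a == {"asterisk": -2, "star": 3, "hash": 5}
```

The last assertion mirrors the remark that the elements of S carry no natural ordering: two dictionaries are equal when they assign the same values to the same keys, whatever order they were written in.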


We visualize $\mathbb{R}^2$ and $\mathbb{R}^3$ in terms of axes. We have a more abstract picture 
of $\mathbb{R}^4$, $\mathbb{R}^5$ and $\mathbb{R}^n$ for larger $n$ while $\mathbb{R}^S$ seems even more abstract. However, 
when thought of as a simple “shopping list”, you can see that vectors in $\mathbb{R}^S$ 
can, in fact, describe everyday objects. In chapter 5 we introduce the general 
definition of a vector space that unifies all these different notions of a vector. 

4.5 Review Problems 


Webwork: 
    Reading problems                  1, 2 
    Vector operations                 3 
    Vectors and lines                 4 
    Vectors and planes                5 
    Lines, planes and vectors         6, 7 
    Equation of a plane               8, 9 
    Angle between a line and plane    10 


1. When he was young, Captain Conundrum mowed lawns on weekends to 
help pay his college tuition bills. He charged his customers according to 
the size of their lawns at a rate of 5¢ per square foot and meticulously 
kept a record of the areas of their lawns in an ordered list: 


A = (200, 300, 50, 50, 100, 100, 200, 500, 1000, 100) . 


He also listed the number of times he mowed each lawn in a given year, 
for the year 1988 that ordered list was 


f = (20, 1, 2, 4, 1,5, 2, 1, 10,6). 


(a) Pretend that $A$ and $f$ are vectors and compute $A \bullet f$. 
(b) What quantity does the dot product $A \bullet f$ measure? 


(c) How much did Captain Conundrum earn from mowing lawns in 
1988? Write an expression for this amount in terms of the vectors 
A and f. 


(d) Suppose Captain Conundrum charged different customers differ- 
ent rates. How could you modify the expression in part 1c to 
compute the Captain’s earnings? 


2. (2) Find the angle between the diagonal of the unit square in $\mathbb{R}^2$ and 
one of the coordinate axes. 


(3) Find the angle between the diagonal of the unit cube in $\mathbb{R}^3$ and 
one of the coordinate axes. 


(n) Find the angle between the diagonal of the unit (hyper)-cube in 
$\mathbb{R}^n$ and one of the coordinate axes. 

($\infty$) What is the limit as $n \to \infty$ of the angle between the diagonal of 
the unit (hyper)-cube in $\mathbb{R}^n$ and one of the coordinate axes? 


3. Consider the matrix $M = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$ and the vector $X = \begin{pmatrix} x \\ y \end{pmatrix}$. 


(a) Sketch $X$ and $MX$ in $\mathbb{R}^2$ for several values of $X$ and $\theta$. 


(b) Compute $\dfrac{\|MX\|}{\|X\|}$ for arbitrary values of $X$ and $\theta$. 


(c) Explain your result for (b) and describe the action of M geomet- 
rically. 


4. (Lorentzian Strangeness). For this problem, consider $\mathbb{R}^n$ with the 
Lorentzian inner product defined in example 44 above. 


(a) Find a non-zero vector in two-dimensional Lorentzian space-time 
with zero length. 

(b) Find and sketch the collection of all vectors in two-dimensional 
Lorentzian space-time with zero length. 

(c) Find and sketch the collection of all vectors in three-dimensional 
Lorentzian space-time with zero length. 


The Story of Your Life 


5. Create a system of equations whose solution set is a 99 dimensional 
hyperplane in $\mathbb{R}^{101}$. 


6. Recall that a plane in $\mathbb{R}^3$ can be described by the equation 

\[ n \bullet \begin{pmatrix} x \\ y \\ z \end{pmatrix} = n \bullet p \]

where the vector $p$ labels a given point on the plane and $n$ is a vector 
normal to the plane. Let $N$ and $P$ be vectors in $\mathbb{R}^{101}$ and 

\[ X = \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^{101} \end{pmatrix}. \]

What kind of geometric object does $N \bullet X = N \bullet P$ describe? 


7. Let 

\[
u = \begin{pmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
\quad\text{and}\quad
v = \begin{pmatrix} 1 \\ 2 \\ 3 \\ \vdots \\ 101 \end{pmatrix}.
\]

Find the projection of $v$ onto $u$ and the projection of $u$ onto $v$. (Hint: 
Remember that two vectors $u$ and $v$ define a plane, so first work out 
how to project one vector onto another in a plane. The picture from 
Section 14.4 could help.) 


8. If the solution set to the equation $A(x) = b$ is the set of vectors whose 
tips lie on the paraboloid $z = x^2 + y^2$, then what can you say about 
the function $A$? 


9. Find a system of equations whose solution set is 

\[
\left\{ \begin{pmatrix} 1 \\ 1 \\ 2 \\ 0 \end{pmatrix}
+ c_1 \begin{pmatrix} -1 \\ -1 \\ 0 \\ 1 \end{pmatrix}
+ c_2 \begin{pmatrix} 0 \\ 0 \\ 1 \\ -3 \end{pmatrix}
\;\middle|\; c_1, c_2 \in \mathbb{R} \right\}.
\]

Give a general procedure for going from a parametric description of a 
hyperplane to a system of equations with that hyperplane as a solution 
set. 


10. If $A$ is a linear operator and both $x = v$ and $x = cv$ (for any real 
number $c$) are solutions to $Ax = b$, then what can you say about $b$? 
Vector Spaces 


As suggested at the end of chapter 4, the vector spaces $\mathbb{R}^n$ are not the only 
vector spaces. We now give a general definition that includes $\mathbb{R}^n$ for all 
values of $n$, and $\mathbb{R}^S$ for all sets $S$, and more. This mathematical structure is 
applicable to a wide range of real-world problems. 

The two key properties of vectors are that they can be added together 
and multiplied by scalars, so we make the following definition. 


Definition A vector space $(V, +, \cdot\,, \mathbb{R})$ is a set $V$ with two operations $+$ and $\cdot$
satisfying the following properties for all $u, v, w \in V$ and $c, d \in \mathbb{R}$:


(+i) (Additive Closure) $u + v \in V$. Adding two vectors gives a vector.


(+ii) (Additive Commutativity) $u + v = v + u$. Order of addition doesn't
matter.


(+iii) (Additive Associativity) $(u + v) + w = u + (v + w)$. Order of adding
many vectors doesn't matter.


(+iv) (Zero) There is a special vector $0_V \in V$ such that $u + 0_V = u$ for all $u$
in $V$.


(+v) (Additive Inverse) For every $u \in V$ there exists $w \in V$ such that
$u + w = 0_V$.


(· i) (Multiplicative Closure) $c \cdot v \in V$. Scalar times a vector is a vector.


(· ii) (Distributivity) $(c + d) \cdot v = c \cdot v + d \cdot v$. Scalar multiplication distributes
over addition of scalars.


(· iii) (Distributivity) $c \cdot (u + v) = c \cdot u + c \cdot v$. Scalar multiplication distributes
over addition of vectors.


(· iv) (Associativity) $(cd) \cdot v = c \cdot (d \cdot v)$.


(· v) (Unity) $1 \cdot v = v$ for all $v \in V$.




Remark Rather than writing $(V, +, \cdot\,, \mathbb{R})$, we will often say “let $V$ be a vector space
over $\mathbb{R}$”. If it is obvious that the numbers used are real numbers, then “let $V$ be a
vector space” suffices. Also, don't confuse the scalar product $\cdot$ with the dot product $\bullet$.
The scalar product is a function that takes as inputs a number and a vector and returns
a vector as its output. This can be written:

\[ \cdot \colon \mathbb{R} \times V \to V. \]

Similarly

\[ + \colon V \times V \to V. \]

On the other hand, the dot product takes two vectors and returns a number. Succinctly:
$\bullet \colon V \times V \to \mathbb{R}$. Once the properties of a vector space have been verified,
we'll just write scalar multiplication with juxtaposition, $cv = c \cdot v$, to avoid
cluttering the notation.


5.1 Examples of Vector Spaces 


One can find many interesting vector spaces, such as the following: 




Example 49

\[ \mathbb{R}^{\mathbb{N}} = \{ f \mid f \colon \mathbb{N} \to \mathbb{R} \} \]

Here the vector space is the set of functions that take in a natural number $n$ and return
a real number. The addition is just addition of functions: $(f_1 + f_2)(n) = f_1(n) + f_2(n)$.
Scalar multiplication is just as simple: $c \cdot f(n) = c f(n)$.
We can think of these functions as infinitely large ordered lists of numbers: $f(1) =
1^3 = 1$ is the first component, $f(2) = 2^3 = 8$ is the second, and so on. Then for
example the function $f(n) = n^3$ would look like this:

\[ f = \begin{pmatrix} 1 \\ 8 \\ 27 \\ \vdots \\ n^3 \\ \vdots \end{pmatrix} \]

Thinking this way, $\mathbb{R}^{\mathbb{N}}$ is the space of all infinite sequences. Because we cannot write
a list infinitely long (without infinite time and ink), one cannot define an element of
this space explicitly; definitions that are implicit, as above, or algebraic as in $f(n) = n^3$
(for all $n \in \mathbb{N}$) suffice.

Let’s check some axioms. 


(+i) (Additive Closure) $(f_1 + f_2)(n) = f_1(n) + f_2(n)$ is indeed a function $\mathbb{N} \to \mathbb{R}$,
since the sum of two real numbers is a real number.


(+iv) (Zero) We need to propose a zero vector. The constant zero function g(n) = 0 
works because then f(n) + g(n) = f(n) +0 = f(n). 


The other axioms should also be checked. This can be done using properties of the 
real numbers. 


Reading homework: problem 1

Example 50 The space of functions of one real variable.

\[ \mathbb{R}^{\mathbb{R}} = \{ f \mid f \colon \mathbb{R} \to \mathbb{R} \} \]

The addition is point-wise

\[ (f + g)(x) = f(x) + g(x), \]

as is scalar multiplication

\[ c \cdot f(x) = c f(x). \]

To check that $\mathbb{R}^{\mathbb{R}}$ is a vector space use the properties of addition of functions and
scalar multiplication of functions as in the previous example.
We cannot write out an explicit definition for one of these functions either: there
are not only infinitely many components, but even infinitely many components between
any two components! You are familiar with algebraic definitions like $f(x) = e^{x^2 - x + 5}$.
However, most vectors in this vector space cannot be defined algebraically. For
example, the nowhere continuous function

\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q} \\ 0, & x \notin \mathbb{Q}. \end{cases} \]


Example 51 $\mathbb{R}^{\{*, \star, \#\}} = \{ f \colon \{*, \star, \#\} \to \mathbb{R} \}$. Again, the properties of addition and
scalar multiplication of functions show that this is a vector space.


You can probably figure out how to show that $\mathbb{R}^S$ is a vector space for any
set $S$. This might lead you to guess that all vector spaces are of the form $\mathbb{R}^S$
for some set $S$. The following is a counterexample.


Example 52 Another very important example of a vector space is the space of all
differentiable functions:

\[ \left\{ f \colon \mathbb{R} \to \mathbb{R} \;\middle|\; \frac{d}{dx} f \text{ exists} \right\}. \]

From calculus, we know that the sum of any two differentiable functions is
differentiable, since the derivative distributes over addition. A scalar multiple of a
function is also differentiable, since the derivative commutes with scalar multiplication
($\frac{d}{dx}(cf) = c \frac{d}{dx} f$). The zero function is just the function such that $0(x) = 0$ for
every $x$. The rest of the vector space properties are inherited from addition and scalar
multiplication in $\mathbb{R}$.


Similarly, the set of functions with at least $k$ derivatives is always a vector
space, as is the space of functions with infinitely many derivatives. None of
these examples can be written as $\mathbb{R}^S$ for some set $S$. Despite our emphasis on
such examples, it is also not true that all vector spaces consist of functions.
Examples are somewhat esoteric, so we omit them.

Another important class of examples is vector spaces that live inside $\mathbb{R}^n$
but are not themselves $\mathbb{R}^n$.


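Example 53 (Solution set to a homogeneous linear equation.)
Let

\[ M = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 3 & 3 \end{pmatrix}. \]

The solution set to the homogeneous equation $Mx = 0$ is

\[ \left\{ c_1 \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \;\middle|\; c_1, c_2 \in \mathbb{R} \right\}. \]

This set is not equal to $\mathbb{R}^3$ since it does not contain, for example, $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$. The sum of
any two solutions is a solution, for example

\[ \left[ 2 \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \right] + \left[ 7 \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + 5 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \right] = 9 \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + 8 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \]

and any scalar multiple of a solution is a solution

\[ 4 \left[ 5 \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} - 3 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \right] = 20 \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} - 12 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}. \]

This example is called a subspace because it gives a vector space inside another vector
space. See chapter 9 for details. Indeed, because it is determined by the linear map
given by the matrix $M$, it is called $\ker M$, or in words, the kernel of $M$; for this see
chapter 16.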


Similarly, the solution set to any homogeneous linear equation is a vector 
space: Additive and multiplicative closure follow from the following state- 
ment, made using linearity of matrix multiplication: 


If $Mx_1 = 0$ and $Mx_2 = 0$ then $M(c_1 x_1 + c_2 x_2) = c_1 M x_1 + c_2 M x_2 = 0 + 0 = 0$.


A powerful result, called the subspace theorem (see chapter 9) guarantees, 
based on the closure properties alone, that homogeneous solution sets are 
vector spaces. 
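
The closure statement can also be checked numerically. The following is a minimal sketch in plain Python, using the matrix $M$ and the two spanning solutions from Example 53; the helper names `matvec` and `combo` are illustrative and not part of the text. It tests only a finite grid of coefficients, so it illustrates the claim rather than proving it.

```python
# Matrix from Example 53 and the two vectors spanning its solution set.
M = [[1, 1, 1],
     [2, 2, 2],
     [3, 3, 3]]
x1 = [-1, 1, 0]
x2 = [-1, 0, 1]

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def combo(c1, u, c2, v):
    """Return the linear combination c1*u + c2*v."""
    return [c1 * ui + c2 * vi for ui, vi in zip(u, v)]

# Every combination c1*x1 + c2*x2 on this grid should again solve M x = 0.
results = [matvec(M, combo(c1, x1, c2, x2))
           for c1 in range(-3, 4) for c2 in range(-3, 4)]
all_zero = all(y == [0, 0, 0] for y in results)
print(all_zero)
```
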

More generally, if V is any vector space, then any hyperplane through 
the origin of V is a vector space. 


Example 54 Consider the functions $f(x) = e^x$ and $g(x) = e^{2x}$ in $\mathbb{R}^{\mathbb{R}}$. By taking
combinations of these two vectors we can form the plane $\{ c_1 f + c_2 g \mid c_1, c_2 \in \mathbb{R} \}$ inside
of $\mathbb{R}^{\mathbb{R}}$. This is a vector space; some examples of vectors in it are $4e^x - 31e^{2x}$, $\pi e^{2x} - 4e^x$
and $\frac{1}{2} e^{2x}$.

A hyperplane which does not contain the origin cannot be a vector space
because it fails condition (+iv).

It is also possible to build new vector spaces from old ones using the 
product of sets. Remember that if V and W are sets, then their product is 
the new set 


\[ V \times W = \{ (v, w) \mid v \in V, w \in W \}, \]


or in words, all ordered pairs of elements from $V$ and $W$. In fact $V \times W$ is a
vector space if $V$ and $W$ are. We have actually been using this fact already:


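Example 55 The real numbers $\mathbb{R}$ form a vector space (over $\mathbb{R}$). The new vector space

\[ \mathbb{R} \times \mathbb{R} = \{ (x, y) \mid x \in \mathbb{R}, y \in \mathbb{R} \} \]

has addition and scalar multiplication defined by

\[ (x, y) + (x', y') = (x + x', y + y') \quad\text{and}\quad c \cdot (x, y) = (cx, cy). \]

Of course, this is just the vector space $\mathbb{R}^2 = \mathbb{R}^{\{1,2\}}$.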


5.1.1 Non-Examples 


The solution set to a linear non-homogeneous equation is not a vector space
because it does not contain the zero vector and therefore fails (+iv).


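Example 56 The solution set to

\[ \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \]

is $\left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix} + c \begin{pmatrix} -1 \\ 1 \end{pmatrix} \;\middle|\; c \in \mathbb{R} \right\}$. The vector $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$ is not in this set.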


Do notice that once just one of the vector space rules is broken, the example 
is not a vector space. 
Most sets of n-vectors are not vector spaces. 


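Example 57 $P := \left\{ \begin{pmatrix} a \\ b \end{pmatrix} \;\middle|\; a, b \geq 0 \right\}$ is not a vector space because the set fails (· i)
since $\begin{pmatrix} 1 \\ 1 \end{pmatrix} \in P$ but $-2 \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -2 \\ -2 \end{pmatrix} \notin P$.

Sets of functions other than those of the form $\mathbb{R}^S$ should be carefully
checked for compliance with the definition of a vector space.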


Example 58 The set of all functions which are never zero

\[ \{ f \colon \mathbb{R} \to \mathbb{R} \mid f(x) \neq 0 \text{ for any } x \in \mathbb{R} \} \]

does not form a vector space because it does not satisfy (+i). The functions $f(x) =
x^2 + 1$ and $g(x) = -5$ are in the set, but their sum $(f + g)(x) = x^2 - 4 = (x + 2)(x - 2)$
is not since $(f + g)(2) = 0$.
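
The failure of additive closure in Example 58 can be seen by evaluating the pointwise sum directly. This is a small sketch in Python; the helper `add` (pointwise addition of functions) is an illustrative name, not notation from the text.

```python
def f(x):
    return x**2 + 1      # never zero for real x

def g(x):
    return -5            # constant function, never zero

def add(p, q):
    """Pointwise sum of two functions, as in the definition of + on function spaces."""
    return lambda x: p(x) + q(x)

h = add(f, g)            # h(x) = x^2 - 4
print(h(2), h(-2))       # the sum vanishes at x = 2 and x = -2
```

So `f` and `g` both belong to the set, but `h` does not, which is exactly the violation of (+i).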


5.2 Other Fields 


Above, we defined vector spaces over the real numbers. One can actually 
define vector spaces over any field. This is referred to as choosing a different 
base field. A field is a collection of “numbers” satisfying properties which are 
listed in appendix B. An example of a field is the complex numbers, 


\[ \mathbb{C} = \{ x + iy \mid i^2 = -1,\ x, y \in \mathbb{R} \}. \]


Example 59 In quantum physics, vector spaces over $\mathbb{C}$ describe all possible states a
physical system can have. For example,

\[ V = \left\{ \begin{pmatrix} \lambda \\ \mu \end{pmatrix} \;\middle|\; \lambda, \mu \in \mathbb{C} \right\} \]

is the set of possible states for an electron's spin. The vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ describe,
respectively, an electron with spin “up” and “down” along a given direction. Other
vectors, like $\begin{pmatrix} i \\ -i \end{pmatrix}$, are permissible, since the base field is the complex numbers. Such
states represent a mixture of spin up and spin down for the given direction (a rather
counterintuitive yet experimentally verifiable concept), but a given spin in some other
direction.


Complex numbers are very useful because of a special property that they
enjoy: every polynomial over the complex numbers factors into a product of
linear polynomials. For example, the polynomial

\[ x^2 + 1 \]

doesn't factor over real numbers, but over complex numbers it factors into

\[ (x + i)(x - i). \]

In other words, there are two solutions to

\[ x^2 = -1, \]

$x = i$ and $x = -i$. This property has far-reaching consequences: often
in mathematics, problems that are very difficult using only real numbers
become relatively simple when working over the complex numbers. This
phenomenon occurs when diagonalizing matrices; see chapter 13.

The rational numbers Q are also a field. This field is important in com- 
puter algebra: a real number given by an infinite string of numbers after the 
decimal point can’t be stored by a computer. So instead rational approxi- 
mations are used. Since the rationals are a field, the mathematics of vector 
spaces still apply to this special case. 

Another very useful field is bits

\[ B_2 = \mathbb{Z}_2 = \{0, 1\}, \]

with the addition and multiplication rules

    + | 0  1        × | 0  1
    --+------       --+------
    0 | 0  1        0 | 0  0
    1 | 1  0        1 | 0  1

These rules can be summarized by the relation $2 = 0$. For bits, it follows
that $-1 = 1$!
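
As a quick illustration, the two tables can be generated by doing ordinary integer arithmetic and reducing modulo 2. This is a sketch in Python; the function names `add` and `mul` are illustrative choices.

```python
def add(a, b):
    """Addition of bits: ordinary addition reduced mod 2 (the same as XOR)."""
    return (a + b) % 2

def mul(a, b):
    """Multiplication of bits: ordinary multiplication reduced mod 2 (the same as AND)."""
    return (a * b) % 2

add_table = [[add(a, b) for b in (0, 1)] for a in (0, 1)]
mul_table = [[mul(a, b) for b in (0, 1)] for a in (0, 1)]

# The additive inverse of 1 is the bit w with 1 + w = 0; it turns out to be 1 itself.
neg_one = next(w for w in (0, 1) if add(1, w) == 0)
print(add_table, mul_table, neg_one)
```
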

The theory of fields is typically covered in a class on abstract algebra or 
Galois theory. 


5.3 Review Problems 


Webwork:
Reading problems 1
Addition and inverse 2


1. Check that

\[ \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \;\middle|\; x, y \in \mathbb{R} \right\} = \mathbb{R}^2 \]

(with the usual addition and scalar multiplication) satisfies all of the parts
in the definition of a vector space.

2. (a) Check that the complex numbers $\mathbb{C} = \{ x + iy \mid i^2 = -1,\ x, y \in \mathbb{R} \}$
satisfy all of the parts in the definition of a vector space over $\mathbb{C}$.
Make sure you state carefully what your rules for vector addition
and scalar multiplication are.


(b) What would happen if you used R as the base field (try comparing 
to problem 1). 


3. (a) Consider the set of convergent sequences, with the same addi- 
tion and scalar multiplication that we defined for the space of 
sequences: 


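\[ V = \left\{ f \;\middle|\; f \colon \mathbb{N} \to \mathbb{R},\ \lim_{n \to \infty} f(n) \in \mathbb{R} \right\} \subset \mathbb{R}^{\mathbb{N}}. \]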


Is this still a vector space? Explain why or why not. 


(b) Now consider the set of divergent sequences, with the same addi- 
tion and scalar multiplication as before: 


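\[ V = \left\{ f \;\middle|\; f \colon \mathbb{N} \to \mathbb{R},\ \lim_{n \to \infty} f(n) \text{ does not exist or is } \pm\infty \right\} \subset \mathbb{R}^{\mathbb{N}}. \]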


Is this a vector space? Explain why or why not. 


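4. Consider the set of $2 \times 4$ matrices:

\[ V = \left\{ \begin{pmatrix} a & b & c & d \\ e & f & g & h \end{pmatrix} \;\middle|\; a, b, c, d, e, f, g, h \in \mathbb{C} \right\} \]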

Propose definitions for addition and scalar multiplication in V. Identify 
the zero vector in V, and check that every matrix in V has an additive 
inverse. 


5. Let $P_3^{\mathbb{R}}$ be the set of polynomials with real coefficients of degree three
or less.

(a) Propose a definition of addition and scalar multiplication to make
$P_3^{\mathbb{R}}$ a vector space.

(b) Identify the zero vector, and find the additive inverse for the vector
$-3 - 2x + x^2$.

(c) Show that $P_3^{\mathbb{R}}$ is not a vector space over $\mathbb{C}$. Propose a small change
to the definition of $P_3^{\mathbb{R}}$ to make it a vector space over $\mathbb{C}$.




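6. Let $V = \{ x \in \mathbb{R} \mid x > 0 \} =: \mathbb{R}_+$. For $x, y \in V$ and $\lambda \in \mathbb{R}$, define

\[ x \oplus y = xy, \qquad \lambda \otimes x = x^{\lambda}. \]

Prove that $(V, \oplus, \otimes, \mathbb{R})$ is a vector space.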


7. The component in the $i$th row and $j$th column of a matrix can be
labeled $m^i_j$. In this sense a matrix is a function of a pair of integers.
For what set $S$ is the set of $2 \times 2$ matrices the same as the set $\mathbb{R}^S$?
Generalize to other size matrices.


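8. Show that any function in $\mathbb{R}^{\{*, \star, \#\}}$ can be written as a sum of multiples
of the functions $e_*$, $e_\star$, $e_\#$ defined by

\[ e_*(k) = \begin{cases} 1, & k = * \\ 0, & k = \star \\ 0, & k = \# \end{cases}, \quad e_\star(k) = \begin{cases} 0, & k = * \\ 1, & k = \star \\ 0, & k = \# \end{cases}, \quad e_\#(k) = \begin{cases} 0, & k = * \\ 0, & k = \star \\ 1, & k = \# \end{cases}. \]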


9. Let $V$ be a vector space and $S$ any set. Show that the set of all functions
mapping $S \to V$, i.e. $V^S$, is a vector space. Hint: first decide upon a
rule for adding functions whose outputs are vectors.


Linear Transformations


Definition A function $L \colon V \to W$ is linear if $V$ and $W$ are vector spaces
and for all $u, v \in V$ and $r, s \in \mathbb{R}$ we have

\[ L(ru + sv) = rL(u) + sL(v). \]


Reading homework: problem 1


Remark We will often refer to linear functions by names like “linear map”, “linear
operator” or “linear transformation”. In some contexts you will also see the name
“homomorphism”.


The definition above coincides with the two part description in chapter 1; 
the case r = 1,s = 1 describes additivity, while s = 0 describes homogeneity. 
We are now ready to learn the powerful consequences of linearity. 


6.1 The Consequence of Linearity 


Now that we have a sufficiently general notion of vector space it is time to 
talk about why linear operators are so special. Think about what is required 
to fully specify a real function of one variable. One output must be specified 
for each input. That is an infinite amount of information. 

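By contrast, even though a linear function can have infinitely many elements
in its domain, it is specified by a very small amount of information.

Example 60 If you know that the function $L$ is linear and that

\[ L \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix} \]

then you do not need any more information to figure out

\[ L \begin{pmatrix} 2 \\ 0 \end{pmatrix},\ L \begin{pmatrix} 3 \\ 0 \end{pmatrix},\ L \begin{pmatrix} 4 \\ 0 \end{pmatrix},\ L \begin{pmatrix} 5 \\ 0 \end{pmatrix},\ \text{etc.}, \]

because by homogeneity

\[ L \begin{pmatrix} 5 \\ 0 \end{pmatrix} = L \left[ 5 \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right] = 5 L \begin{pmatrix} 1 \\ 0 \end{pmatrix} = 5 \begin{pmatrix} 5 \\ 3 \end{pmatrix} = \begin{pmatrix} 25 \\ 15 \end{pmatrix}. \]

In this way an infinite number of outputs is specified by just one.
Likewise, if you know that $L$ is linear and that

\[ L \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix} \quad\text{and}\quad L \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \end{pmatrix} \]

then you don't need any more information to compute

\[ L \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]

because by additivity

\[ L \begin{pmatrix} 1 \\ 1 \end{pmatrix} = L \left[ \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right] = L \begin{pmatrix} 1 \\ 0 \end{pmatrix} + L \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix} + \begin{pmatrix} 2 \\ 2 \end{pmatrix} = \begin{pmatrix} 7 \\ 5 \end{pmatrix}. \]

In fact, since every vector in $\mathbb{R}^2$ can be expressed as

\[ \begin{pmatrix} x \\ y \end{pmatrix} = x \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \]

we know how $L$ acts on every vector from $\mathbb{R}^2$ by linearity based on just two pieces of
information:

\[ L \begin{pmatrix} x \\ y \end{pmatrix} = L \left[ x \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right] = x L \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y L \begin{pmatrix} 0 \\ 1 \end{pmatrix} = x \begin{pmatrix} 5 \\ 3 \end{pmatrix} + y \begin{pmatrix} 2 \\ 2 \end{pmatrix} = \begin{pmatrix} 5x + 2y \\ 3x + 2y \end{pmatrix}. \]

Thus, the value of $L$ at infinitely many inputs is completely specified by its value at
just two inputs. (We can see now that $L$ acts in exactly the way the matrix

\[ \begin{pmatrix} 5 & 2 \\ 3 & 2 \end{pmatrix} \]

acts on vectors from $\mathbb{R}^2$.)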
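
The same reasoning can be written as a few lines of Python. This is a minimal sketch: the names `L_e1` and `L_e2` (the two known outputs) are illustrative, and the function `L` uses nothing beyond additivity and homogeneity.

```python
L_e1 = (5, 3)   # known value of L at (1, 0)
L_e2 = (2, 2)   # known value of L at (0, 1)

def L(x, y):
    """Value of L at (x, y) = x*(1, 0) + y*(0, 1), obtained by linearity alone."""
    return (x * L_e1[0] + y * L_e2[0],
            x * L_e1[1] + y * L_e2[1])

print(L(5, 0))   # homogeneity: 5 * L(1, 0)
print(L(1, 1))   # additivity: L(1, 0) + L(0, 1)
```

Two stored outputs are enough to evaluate `L` at any input, which is the sense in which a linear function carries very little information.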
Reading homework: problem 2


This is the reason that linear functions are so nice; they are secretly very 
simple functions by virtue of two characteristics: 


1. They act on vector spaces. 


2. They act additively and homogeneously. 


A linear transformation with domain $\mathbb{R}^3$ is completely specified by the
way it acts on the three vectors

\[ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \]

Similarly, a linear transformation with domain $\mathbb{R}^n$ is completely specified
by its action on the $n$ different $n$-vectors that have exactly one non-zero
component, and its matrix form can be read off this information. However,
not all linear functions have such nice domains.


6.2 Linear Functions on Hyperplanes 


It is not always so easy to write a linear operator as a matrix. Generally, 
this will amount to solving a linear systems problem. Examining a linear 
function whose domain is a hyperplane is instructive. 


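Example 61 Let

\[ V = \left\{ c_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \;\middle|\; c_1, c_2 \in \mathbb{R} \right\} \]

and consider $L \colon V \to \mathbb{R}^3$ defined by

\[ L \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad L \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}. \]

By linearity this specifies the action of $L$ on any vector from $V$ as

\[ L \left[ c_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right] = (c_1 + c_2) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}. \]

The domain of $L$ is a plane and its range is the line through the origin in the $x_2$
direction. It is clear how to check that $L$ is linear.
It is not clear how to formulate $L$ as a matrix; since

\[ L \begin{pmatrix} c_1 \\ c_1 + c_2 \\ c_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} c_1 \\ c_1 + c_2 \\ c_2 \end{pmatrix} = (c_1 + c_2) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \]

or since

\[ L \begin{pmatrix} c_1 \\ c_1 + c_2 \\ c_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} c_1 \\ c_1 + c_2 \\ c_2 \end{pmatrix} = (c_1 + c_2) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \]

you might suspect that $L$ is equivalent to one of these $3 \times 3$ matrices. It is not. All
$3 \times 3$ matrices have $\mathbb{R}^3$ as their domain, and the domain of $L$ is smaller than that.
When we do realize this $L$ as a matrix it will be as a $3 \times 2$ matrix. We can tell because
the domain of $L$ is 2 dimensional and the codomain is 3 dimensional.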


6.3 Linear Differential Operators 


Your calculus class became much easier when you stopped using the limit 
definition of the derivative, learned the power rule, and started using linearity 
of the derivative operator. 


Example 62 Let $V$ be the vector space of polynomials of degree 2 or less with standard
addition and scalar multiplication.

\[ V = \{ a_0 \cdot 1 + a_1 x + a_2 x^2 \mid a_0, a_1, a_2 \in \mathbb{R} \} \]

Let $\frac{d}{dx} \colon V \to V$ be the derivative operator. The following three equations, along with
linearity of the derivative operator, allow one to take the derivative of any 2nd degree
polynomial:

\[ \frac{d}{dx} 1 = 0, \qquad \frac{d}{dx} x = 1, \qquad \frac{d}{dx} x^2 = 2x. \]

In particular

\[ \frac{d}{dx} (a_0 \cdot 1 + a_1 x + a_2 x^2) = a_0 \frac{d}{dx} 1 + a_1 \frac{d}{dx} x + a_2 \frac{d}{dx} x^2 = 0 + a_1 + 2 a_2 x. \]

Thus, the derivative acting on any of the infinitely many second order polynomials is
determined by its action for just three inputs.


6.4 Bases (Take 1)


The central idea of linear algebra is to exploit the hidden simplicity of linear
functions. It turns out that there is a lot of freedom in how to do this. That
freedom is what makes linear algebra powerful.

You saw that a linear operator acting on $\mathbb{R}^2$ is completely specified by
how it acts on the pair of vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. In fact, any linear operator
acting on $\mathbb{R}^2$ is also completely specified by how it acts on the pair of vectors
$\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.

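Example 63 If $L$ is a linear operator then it is completely specified
by the two equalities

\[ L \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}, \quad\text{and}\quad L \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 6 \\ 8 \end{pmatrix}. \]

This is because any vector $\begin{pmatrix} x \\ y \end{pmatrix}$ in $\mathbb{R}^2$ is a sum of multiples of $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$, which
can be calculated via a linear systems problem as follows:

\[ \begin{pmatrix} x \\ y \end{pmatrix} = a \begin{pmatrix} 1 \\ 1 \end{pmatrix} + b \begin{pmatrix} 1 \\ -1 \end{pmatrix} \;\Leftrightarrow\; \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} \;\Leftrightarrow\; a = \frac{x + y}{2},\ b = \frac{x - y}{2}. \]

Thus

\[ \begin{pmatrix} x \\ y \end{pmatrix} = \frac{x + y}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{x - y}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]

We can then calculate how $L$ acts on any vector by first expressing the vector as a
sum of multiples and then applying linearity:

\begin{align*}
L \begin{pmatrix} x \\ y \end{pmatrix} &= L \left[ \frac{x + y}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{x - y}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right] \\
&= \frac{x + y}{2} L \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{x - y}{2} L \begin{pmatrix} 1 \\ -1 \end{pmatrix} \\
&= \frac{x + y}{2} \begin{pmatrix} 2 \\ 4 \end{pmatrix} + \frac{x - y}{2} \begin{pmatrix} 6 \\ 8 \end{pmatrix} \\
&= \begin{pmatrix} x + y \\ 2(x + y) \end{pmatrix} + \begin{pmatrix} 3(x - y) \\ 4(x - y) \end{pmatrix} \\
&= \begin{pmatrix} 4x - 2y \\ 6x - 2y \end{pmatrix}.
\end{align*}

Thus $L$ is completely specified by its value at just two inputs.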
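
The two-step recipe (first find the coefficients, then apply linearity) can be sketched in Python as follows. The names `coordinates`, `L_b1` and `L_b2` are illustrative, and the sketch assumes integer inputs so that exact fractions can be used.

```python
from fractions import Fraction

L_b1 = (2, 4)   # known value of L at (1, 1)
L_b2 = (6, 8)   # known value of L at (1, -1)

def coordinates(x, y):
    """Solve (x, y) = a*(1, 1) + b*(1, -1) for the coefficients a and b."""
    a = Fraction(x + y, 2)
    b = Fraction(x - y, 2)
    return a, b

def L(x, y):
    """Apply L by expanding (x, y) in the pair (1, 1), (1, -1) and using linearity."""
    a, b = coordinates(x, y)
    return (a * L_b1[0] + b * L_b2[0],
            a * L_b1[1] + b * L_b2[1])

print(L(1, 1), L(1, -1), L(3, 1))
```

Expanding the arithmetic by hand gives first component $(x+y) + 3(x-y) = 4x - 2y$ and second component $2(x+y) + 4(x-y) = 6x - 2y$, which is what the function returns.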


It should not surprise you to learn there are infinitely many pairs of
vectors from $\mathbb{R}^2$ with the property that any vector can be expressed as a
linear combination of them; any pair that when used as columns of a matrix
gives an invertible matrix works. Such a pair is called a basis for $\mathbb{R}^2$.

Similarly, there are infinitely many triples of vectors with the property
that any vector from $\mathbb{R}^3$ can be expressed as a linear combination of them:
these are the triples that used as columns of a matrix give an invertible
matrix. Such a triple is called a basis for $\mathbb{R}^3$.

In a similar spirit, there are infinitely many pairs of vectors with the
property that every vector in

\[ V = \left\{ c_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \;\middle|\; c_1, c_2 \in \mathbb{R} \right\} \]

can be expressed as a linear combination of them. Some examples are

\[ V = \left\{ c_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} 0 \\ 2 \\ 2 \end{pmatrix} \;\middle|\; c_1, c_2 \in \mathbb{R} \right\} = \left\{ c_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix} \;\middle|\; c_1, c_2 \in \mathbb{R} \right\}. \]

Such a pair is called a basis for $V$.
You probably have some intuitive notion of what dimension means (the
careful mathematical definition is given in chapter 11). Roughly speaking,
dimension is the number of independent directions available. To figure out
the dimension of a vector space, I stand at the origin, and pick a direction.
If there are any vectors in my vector space that aren't in that direction, then
I choose another direction that isn't in the line determined by the direction I
chose. If there are any vectors in my vector space not in the plane determined
by the first two directions, then I choose one of them as my next direction.
In other words, I choose a collection of independent vectors in the vector
space (independent vectors are defined in chapter 10). A maximal set of
independent vectors, one that cannot be enlarged without losing independence,
is called a basis (see chapter 11 for the precise definition).
The number of vectors in my basis is the dimension of the vector space. Every
vector space has many bases, but all bases for a particular vector space have
the same number of vectors. Thus dimension is a well-defined concept.

The fact that every vector space (over R) has infinitely many bases is 
actually very useful. Often a good choice of basis can reduce the time required 
to run a calculation in dramatic ways! 

In summary: 


A basis is a set of vectors in terms of which it is possible to 
uniquely express any other vector. 


6.5 Review Problems 


Webwork:
Reading problems 1, 2
Linear? 3
Matrix × vector 4, 5
Linearity 6, 7


1. Show that the pair of conditions:

\[ \begin{cases} L(u + v) = L(u) + L(v) \\ L(cv) = cL(v) \end{cases} \tag{1} \]

(valid for all vectors $u, v$ and any scalar $c$) is equivalent to the single
condition:

\[ L(ru + sv) = rL(u) + sL(v), \tag{2} \]

(for all vectors $u, v$ and any scalars $r$ and $s$). Your answer should have
two parts. Show that (1) $\Rightarrow$ (2), and then show that (2) $\Rightarrow$ (1).

2. If $f$ is a linear function of one variable, then how many points on the
graph of the function are needed to specify the function? Give an
explicit expression for $f$ in terms of these points.


3. (a) If $p \begin{pmatrix} 1 \\ 2 \end{pmatrix} = 1$ and $p \begin{pmatrix} 2 \\ 4 \end{pmatrix} = 3$ is it possible that $p$ is a linear
function?


(b) If $Q(x^2) = x^3$ and $Q(2x^2) = x^4$ is it possible that $Q$ is a linear
function from polynomials to polynomials?


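4. If $f$ is a linear function such that

\[ f \begin{pmatrix} 1 \\ 2 \end{pmatrix} = 0, \quad\text{and}\quad f \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 1, \]

then what is $f \begin{pmatrix} x \\ y \end{pmatrix}$?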


5. Let $P_n$ be the space of polynomials of degree $n$ or less in the variable $t$.
Suppose $L$ is a linear transformation from $P_2 \to P_3$ such that $L(1) = 4$,
$L(t) = t^3$, and $L(t^2) = t - 1$.

(a) Find $L(1 + t + 2t^2)$.
(b) Find $L(a + bt + ct^2)$.
(c) Find all values $a, b, c$ such that $L(a + bt + ct^2) = 1 + 3t + 2t^3$.




6. Show that the operator $\mathcal{I}$ that maps $f$ to the function $\mathcal{I}f$ defined
by $\mathcal{I}f(x) := \int_0^x f(t)\,dt$ is a linear operator on the space of continuous
functions.


7. Let $z \in \mathbb{C}$. Recall that we can express $z = x + iy$ where $x, y \in \mathbb{R}$, and
we can form the complex conjugate of $z$ by taking $\bar{z} = x - iy$. The
function $c \colon \mathbb{R}^2 \to \mathbb{R}^2$ which sends $(x, y) \mapsto (x, -y)$ agrees with complex
conjugation.

(a) Show that $c$ is a linear map over $\mathbb{R}$ (i.e. scalars in $\mathbb{R}$).

(b) Show that complex conjugation $z \mapsto \bar{z}$ is not linear over $\mathbb{C}$.


Matrices are a powerful tool for calculations involving linear transformations.
It is important to understand how to find the matrix of a linear transformation
and the properties of matrices.


7.1 Linear Transformations and Matrices 


Ordered, finite-dimensional bases for vector spaces allow us to express linear
operators as matrices.


7.1.1 Basis Notation 


A basis allows us to efficiently label arbitrary vectors in terms of column 
vectors. Here is an example. 


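Example 64 Let

\[ V = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \;\middle|\; a, b, c, d \in \mathbb{R} \right\} \]

be the vector space of $2 \times 2$ real matrices, with addition and scalar multiplication
defined componentwise. One choice of basis is the ordered set (or list) of matrices

\[ B = \left( \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right) =: (e^1_1, e^1_2, e^2_1, e^2_2). \]

Given a particular vector and a basis, your job is to write that vector as a sum of
multiples of basis elements. Here an arbitrary vector $v \in V$ is just a matrix, so we
write

\begin{align*}
v = \begin{pmatrix} a & b \\ c & d \end{pmatrix} &= \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ c & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & d \end{pmatrix} \\
&= a \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + b \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} + c \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} + d \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \\
&= a e^1_1 + b e^1_2 + c e^2_1 + d e^2_2.
\end{align*}

The coefficients $(a, b, c, d)$ of the basis vectors $(e^1_1, e^1_2, e^2_1, e^2_2)$ encode the information
of which matrix the vector $v$ is. We store them in a column vector by writing

\[ v = a e^1_1 + b e^1_2 + c e^2_1 + d e^2_2 =: (e^1_1, e^1_2, e^2_1, e^2_2) \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} =: \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}_B. \]

The column vector $\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}$ encodes the vector $v$ but is NOT equal to it! (After all, $v$ is
a matrix so could not equal a column vector.) Both notations on the right hand side of
the above equation really stand for the vector obtained by multiplying the coefficients
stored in the column vector by the corresponding basis element and then summing
over them.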


Next, let's consider a tautological example showing how to label column
vectors in terms of column vectors:


Example 65 (Standard Basis of $\mathbb{R}^2$)

The vectors

\[ e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]

are called the standard basis vectors of $\mathbb{R}^2 = \mathbb{R}^{\{1,2\}}$. Their descriptions as functions
of $\{1, 2\}$ are

\[ e_1(k) = \begin{cases} 1 & \text{if } k = 1 \\ 0 & \text{if } k = 2 \end{cases}, \qquad e_2(k) = \begin{cases} 0 & \text{if } k = 1 \\ 1 & \text{if } k = 2. \end{cases} \]

It is natural to assign these the order: $e_1$ is first and $e_2$ is second. An arbitrary vector $v$
of $\mathbb{R}^2$ can be written as

\[ v = \begin{pmatrix} x \\ y \end{pmatrix} = x e_1 + y e_2. \]

To emphasize that we are using the standard basis we define the list (or ordered set)

\[ E = (e_1, e_2), \]

and write

\[ \begin{pmatrix} x \\ y \end{pmatrix}_E := (e_1, e_2) \begin{pmatrix} x \\ y \end{pmatrix} := x e_1 + y e_2 = v. \]

You should read this equation by saying:

“The column vector of the vector $v$ in the basis $E$ is $\begin{pmatrix} x \\ y \end{pmatrix}$.”

Again, the first notation of a column vector with a subscript $E$ refers to the vector
obtained by multiplying each basis vector by the corresponding scalar listed in the
column and then summing these, i.e. $x e_1 + y e_2$. The second notation denotes exactly
the same thing but we first list the basis elements and then the column vector; a
useful trick because this can be read in the same way as matrix multiplication of a row
vector times a column vector, except that the entries of the row vector are themselves
vectors!


You should already try to write down the standard basis vectors for $\mathbb{R}^n$
for other values of $n$ and express an arbitrary vector in $\mathbb{R}^n$ in terms of them.

The last example probably seems pedantic because column vectors are already
just ordered lists of numbers and the basis notation has simply allowed
us to “re-express” these as lists of numbers. Of course, this objection does
not apply to more complicated vector spaces like our first matrix example.
Moreover, as we saw earlier, there are infinitely many other pairs of vectors
in $\mathbb{R}^2$ that form a basis.


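Example 66 (A Non-Standard Basis of $\mathbb{R}^2 = \mathbb{R}^{\{1,2\}}$)

\[ b = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]

As functions of $\{1, 2\}$ they read

\[ b(k) = \begin{cases} 1 & \text{if } k = 1 \\ 1 & \text{if } k = 2 \end{cases}, \qquad \beta(k) = \begin{cases} 1 & \text{if } k = 1 \\ -1 & \text{if } k = 2. \end{cases} \]

Notice something important: there is no reason to say that $\beta$ comes before $b$ or
vice versa. That is, there is no a priori reason to give these basis elements one order
or the other. However, it will be necessary to give the basis elements an order if we
want to use them to encode other vectors. We choose one arbitrarily; let

\[ B = (b, \beta) \]

be the ordered basis. Note that for an unordered set we use the $\{\}$ parentheses while
for lists or ordered sets we use $()$.

As before we define

\[ \begin{pmatrix} x \\ y \end{pmatrix}_B := (b, \beta) \begin{pmatrix} x \\ y \end{pmatrix} := x b + y \beta. \]

You might think that the numbers $x$ and $y$ denote exactly the same vector as in the
previous example. However, they do not. Inserting the actual vectors that $b$ and $\beta$
represent we have

\[ x b + y \beta = x \begin{pmatrix} 1 \\ 1 \end{pmatrix} + y \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} x + y \\ x - y \end{pmatrix}. \]

Thus, to contrast, we have

\[ \begin{pmatrix} x \\ y \end{pmatrix}_B = \begin{pmatrix} x + y \\ x - y \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} x \\ y \end{pmatrix}_E = \begin{pmatrix} x \\ y \end{pmatrix}. \]

Only in the standard basis $E$ does the column vector of $v$ agree with the column vector
that $v$ actually is!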


Based on the above example, you might think that our aim would be to
find the “standard basis” for any problem. In fact, this is far from the truth.
Notice, for example that the vector

\[ v = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = e_1 + e_2 = b \]

written in the standard basis $E$ is just

\[ v = \begin{pmatrix} 1 \\ 1 \end{pmatrix}_E, \]

while in the basis $B$ we find

\[ v = \begin{pmatrix} 1 \\ 0 \end{pmatrix}_B, \]

which is actually a simpler column vector! The fact that there are many
bases for any given vector space allows us to choose a basis in which our
computation is easiest. In any case, the standard basis only makes sense
for $\mathbb{R}^n$. Suppose your vector space was the set of solutions to a differential
equation: what would a standard basis then be?


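Example 67 (A Basis For a Hyperplane)
Let's again consider the hyperplane

\[ V = \left\{ c_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \;\middle|\; c_1, c_2 \in \mathbb{R} \right\}. \]

One possible choice of ordered basis is

\[ b_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad b_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \quad B = (b_1, b_2). \]

With this choice

\[ \begin{pmatrix} x \\ y \end{pmatrix}_B := x b_1 + y b_2 = x \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} x \\ x + y \\ y \end{pmatrix}_E. \]

With the other choice of order $B' = (b_2, b_1)$

\[ \begin{pmatrix} x \\ y \end{pmatrix}_{B'} := x b_2 + y b_1 = x \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + y \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} y \\ x + y \\ x \end{pmatrix}_E. \]

We see that the order of basis elements matters.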
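
To see the effect of ordering concretely, here is a short Python sketch that turns a column of coefficients into the vector it encodes. The helper name `from_coordinates` and the sample column $(2, 5)$ are illustrative choices, not taken from the text.

```python
b1 = (1, 1, 0)
b2 = (0, 1, 1)

def from_coordinates(basis, column):
    """Return the vector sum(column[i] * basis[i]), written in the standard basis."""
    n = len(basis[0])
    return tuple(sum(c * b[k] for c, b in zip(column, basis)) for k in range(n))

B = (b1, b2)          # one ordering of the basis
B_prime = (b2, b1)    # the other ordering

v = from_coordinates(B, (2, 5))         # 2*b1 + 5*b2
w = from_coordinates(B_prime, (2, 5))   # 2*b2 + 5*b1
print(v, w)
```

The same column of numbers labels two different vectors of the hyperplane, depending on which ordered basis it is read in.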


Finding the column vector of a given vector in a given basis usually 
amounts to a linear systems problem: 


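Example 68 (Pauli Matrices)
Let

\[ V = \left\{ \begin{pmatrix} z & u \\ v & -z \end{pmatrix} \;\middle|\; z, u, v \in \mathbb{C} \right\} \]

be the vector space of trace-free complex-valued matrices (over $\mathbb{C}$) with basis $B =
(\sigma_x, \sigma_y, \sigma_z)$, where

\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]

These three matrices are the famous Pauli matrices; they are used to describe electrons
in quantum theory. Let

\[ v = \begin{pmatrix} -2 + i & 1 + i \\ 3 - i & 2 - i \end{pmatrix}. \]

Find the column vector of $v$ in the basis $B$.
For this we must solve the equation

\[ \begin{pmatrix} -2 + i & 1 + i \\ 3 - i & 2 - i \end{pmatrix} = \alpha^x \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + \alpha^y \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + \alpha^z \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]

This gives three equations, i.e. a linear systems problem, for the $\alpha$'s

\begin{align*}
\alpha^x - i \alpha^y &= 1 + i \\
\alpha^x + i \alpha^y &= 3 - i \\
\alpha^z &= -2 + i
\end{align*}

with solution

\[ \alpha^x = 2, \quad \alpha^y = -1 - i, \quad \alpha^z = -2 + i. \]

(Adding the first two equations gives $2\alpha^x = 4$; subtracting them gives $2i\alpha^y = 2 - 2i$,
so $\alpha^y = (1 - i)/i = -1 - i$.) Hence

\[ v = \begin{pmatrix} 2 \\ -1 - i \\ -2 + i \end{pmatrix}_B. \]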
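
The three equations of this example can be solved and checked with Python's built-in complex numbers. This is a sketch under the assumption that $v$ has top row $(-2+i,\ 1+i)$ and bottom row $(3-i,\ 2-i)$, as the equations require; the variable names `a_x`, `a_y`, `a_z` are illustrative.

```python
# The matrix v, with 1j playing the role of i.
v = [[-2 + 1j, 1 + 1j],
     [3 - 1j, 2 - 1j]]

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

# Entries of a_x*sigma_x + a_y*sigma_y + a_z*sigma_z:
#   top right   = a_x - i*a_y
#   bottom left = a_x + i*a_y
#   top left    = a_z
a_x = (v[0][1] + v[1][0]) / 2      # add the first two equations
a_y = (v[1][0] - v[0][1]) / (2j)   # subtract them
a_z = v[0][0]

# Rebuild the matrix from the components to confirm the solution.
rebuilt = [[a_x * sigma_x[r][c] + a_y * sigma_y[r][c] + a_z * sigma_z[r][c]
            for c in range(2)] for r in range(2)]
print(a_x, a_y, a_z)
```

Rebuilding the matrix from the computed components is a useful habit: it checks all four entries, including the bottom-right one that the three equations do not use directly.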
To summarize, the column vector of a vector $v$ in an ordered basis $B =
(b_1, b_2, \ldots, b_n)$,

\[ \begin{pmatrix} \alpha^1 \\ \alpha^2 \\ \vdots \\ \alpha^n \end{pmatrix}, \]

is defined by solving the linear systems problem

\[ v = \alpha^1 b_1 + \alpha^2 b_2 + \cdots + \alpha^n b_n = \sum_{i=1}^{n} \alpha^i b_i. \]

The numbers $(\alpha^1, \alpha^2, \ldots, \alpha^n)$ are called the components of the vector $v$. Two
useful shorthand notations for this are

\[ v = \begin{pmatrix} \alpha^1 \\ \alpha^2 \\ \vdots \\ \alpha^n \end{pmatrix}_B = (b_1, b_2, \ldots, b_n) \begin{pmatrix} \alpha^1 \\ \alpha^2 \\ \vdots \\ \alpha^n \end{pmatrix}. \]


7.1.2 From Linear Operators to Matrices


Chapter 6 showed that linear functions are very special kinds of functions;
they are fully specified by their values on any basis for their domain. A
matrix records how a linear operator maps an element of the basis to a sum
of multiples in the target space basis.

More carefully, if $L$ is a linear operator from $V$ to $W$ then the matrix for
$L$ in the ordered bases $B = (b_1, b_2, \ldots)$ for $V$ and $B' = (\beta_1, \beta_2, \ldots)$ for $W$ is
the array of numbers $m^j_i$ specified by

\[ L(b_i) = m^1_i \beta_1 + \cdots + m^j_i \beta_j + \cdots \]


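Remark To calculate the matrix of a linear transformation you must compute what
the linear transformation does to every input basis vector and then write the answers
in terms of the output basis vectors:

\begin{align*}
&\big( L(b_1), L(b_2), \ldots, L(b_i), \ldots \big) \\
&= \left( (\beta_1, \beta_2, \ldots, \beta_j, \ldots) \begin{pmatrix} m^1_1 \\ m^2_1 \\ \vdots \\ m^j_1 \\ \vdots \end{pmatrix},\ (\beta_1, \beta_2, \ldots, \beta_j, \ldots) \begin{pmatrix} m^1_2 \\ m^2_2 \\ \vdots \\ m^j_2 \\ \vdots \end{pmatrix},\ \ldots,\ (\beta_1, \beta_2, \ldots, \beta_j, \ldots) \begin{pmatrix} m^1_i \\ m^2_i \\ \vdots \\ m^j_i \\ \vdots \end{pmatrix},\ \ldots \right) \\
&= (\beta_1, \beta_2, \ldots, \beta_j, \ldots) \begin{pmatrix} m^1_1 & m^1_2 & \cdots & m^1_i & \cdots \\ m^2_1 & m^2_2 & \cdots & m^2_i & \cdots \\ \vdots & \vdots & & \vdots & \\ m^j_1 & m^j_2 & \cdots & m^j_i & \cdots \\ \vdots & \vdots & & \vdots & \end{pmatrix}
\end{align*}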


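Example 69 Consider $L : V \to \mathbb{R}^3$ (as in example 61) defined by
\[ L \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad
   L \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}. \]

By linearity this specifies the action of L on any vector from V as
\[ L \left[ c^1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c^2 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right]
 = (c^1 + c^2) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}. \]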



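We had trouble expressing this linear operator as a matrix. Let's take input basis
\[ B = \left( \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right) =: (b_1, b_2), \]
and output basis
\[ E = \left( \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right) =: (e_1, e_2, e_3). \]
Then
\[ L b_1 = 0 e_1 + 1 e_2 + 0 e_3 = L b_2, \]
or
\[ (L b_1, L b_2) = \left( (e_1, e_2, e_3) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},\ (e_1, e_2, e_3) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right)
 = (e_1, e_2, e_3) \begin{pmatrix} 0 & 0 \\ 1 & 1 \\ 0 & 0 \end{pmatrix}. \]

The matrix on the right is the matrix of L in these bases. More succinctly we could
write
\[ L \begin{pmatrix} x \\ y \end{pmatrix}_B = (x + y) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}_E \]
and thus see that L acts like the matrix
\[ \begin{pmatrix} 0 & 0 \\ 1 & 1 \\ 0 & 0 \end{pmatrix}. \]
Hence
\[ L \begin{pmatrix} x \\ y \end{pmatrix}_B = \left( \begin{pmatrix} 0 & 0 \\ 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \right)_E; \]
given input and output bases, the linear operator is now encoded by a matrix.


This is the general rule for this chapter:


Linear operators become matrices when given
ordered input and output bases.


Reading homework: problem 1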
Example 70 Let's compute a matrix for the derivative operator acting on the vector
space of polynomials of degree 2 or less:
\[ V = \{ a_0 1 + a_1 x + a_2 x^2 \mid a_0, a_1, a_2 \in \mathbb{R} \}. \]

In the ordered basis $B = (1, x, x^2)$ we write
\[ \begin{pmatrix} a \\ b \\ c \end{pmatrix}_B = a \cdot 1 + b x + c x^2 \]
and
\[ \frac{d}{dx} \begin{pmatrix} a \\ b \\ c \end{pmatrix}_B = b \cdot 1 + 2c x + 0 x^2 = \begin{pmatrix} b \\ 2c \\ 0 \end{pmatrix}_B. \]

In the ordered basis B for both domain and range
\[ \frac{d}{dx} \mapsto \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}. \]
Notice this last equation makes no sense without explaining which bases we are using!
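
The following minimal sketch shows the derivative matrix acting on coefficient columns in the ordered basis (1, x, x^2). The helper `apply` and the sample polynomial are illustrative choices, not part of the text.

```python
# Matrix of d/dx on polynomials of degree <= 2, in the ordered basis (1, x, x^2)
# for both domain and range.
D = [[0, 1, 0],
     [0, 0, 2],
     [0, 0, 0]]

def apply(matrix, column):
    """Multiply a matrix (list of rows) by a column vector (list of components)."""
    return [sum(m * c for m, c in zip(row, column)) for row in matrix]

p = [5, 3, 4]            # the polynomial 5 + 3x + 4x^2
dp = apply(D, p)         # components of its derivative, 3 + 8x
ddp = apply(D, dp)       # components of its second derivative, 8
```

Reordering the basis, say to (x^2, x, 1), would permute the rows and columns of `D`, which is why the matrix is meaningless without naming the bases.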


7.2 Review Problems 


Reading problem len 


Webwork: Matrix of a Linear Transformation | 9, 10, 11, 12, 13 


1. A door factory can buy supplies in two kinds of packages, f and g. The 
package f contains 3 slabs of wood, 4 fasteners, and 6 brackets. The 
package g contains 5 fasteners, 3 brackets, and 7 slabs of wood. 


(a) Give a list of inputs and outputs for the functions f and g. 


(b) Give an order to the 3 kinds of supplies and then write f and g
as elements of $\mathbb{R}^3$.


(c) Let L be the manufacturing process; it takes in supply packages 
and gives out two products (doors, and door frames) and it is 
linear in supplies. If Lf is 1 door and 2 frames and Lg is 3 doors 
and 1 frame, find a matrix for L. 
2. You are designing a simple keyboard synthesizer with two keys. If you
push the first key with intensity a then the speaker moves in time as 
asin(t). If you push the second key with intensity b then the speaker 
moves in time as bsin(2t). If the keys are pressed simultaneously, 


(a) describe the set of all sounds that come out of your synthesizer. 
(Hint: Sounds can be “added” .) 


(b) Graph the function C c R2, 


(c) Let B = (sin(t),sin(2t)). Explain why (5) is not in Rt} but 
B 
is still a function. 


(d) Graph the function o 
B 


3. (a) Find the matrix for $\frac{d}{dx}$ acting on the vector space V of polynomials
of degree 2 or less in the ordered basis $B' = (x^2, x, 1)$.


(b) Use the matrix from part (a) to rewrite the differential equation
$\frac{d}{dx} p(x) = x$ as a matrix equation. Find all solutions of the matrix
equation. Translate them into elements of V.


(c) Find the matrix for $\frac{d}{dx}$ acting on the vector space V in the ordered
basis $(x^2 + x, x^2 - x, 1)$.


(d) Use the matrix from part (c) to rewrite the differential equation
$\frac{d}{dx} p(x) = x$ as a matrix equation. Find all solutions of the matrix
equation. Translate them into elements of V.


(e) Compare and contrast your results from parts (b) and (d). 


4. Find the “matrix” for $\frac{d}{dx}$ acting on the vector space of all power series
in the ordered basis $(1, x, x^2, x^3, \ldots)$. Use this matrix to find all power
series solutions to the differential equation $\frac{d}{dx} f(x) = x$. Hint: your
“matrix” may not have finite size.


5. Find the matrix for a acting on {c cos(x) + c2 sin(x)|c, c2 € R} in 
the ordered basis (cos(x), sin(x)). 


6. Find the matrix for $\frac{d}{dx}$ acting on $\{c_1 \cosh(x) + c_2 \sinh(x) \mid c_1, c_2 \in \mathbb{R}\}$ in
the ordered basis $(\cosh(x) + \sinh(x), \cosh(x) - \sinh(x))$.
(Recall that the hyperbolic trigonometric functions are defined by
\[ \cosh(x) = \frac{e^x + e^{-x}}{2}, \qquad \sinh(x) = \frac{e^x - e^{-x}}{2}.) \]


7. Let $B = (1, x, x^2)$ be an ordered basis for
\[ V = \{ a_0 + a_1 x + a_2 x^2 \mid a_0, a_1, a_2 \in \mathbb{R} \}, \]
and let $B' = (x^3, x^2, x, 1)$ be an ordered basis for
\[ W = \{ a_0 + a_1 x + a_2 x^2 + a_3 x^3 \mid a_0, a_1, a_2, a_3 \in \mathbb{R} \}. \]


Find the matrix for the operator Z : V — W defined by 


relative to these bases. 


7.3 Properties of Matrices 


The objects of study in linear algebra are linear operators. We have seen that 
linear operators can be represented as matrices through choices of ordered 
bases, and that matrices provide a means of efficient computation. 

We now begin an in depth study of matrices. 
Definition An r × k matrix $M = (m^i_j)$ for $i = 1, \ldots, r$; $j = 1, \ldots, k$ is a
rectangular array of real (or complex) numbers:
\[ M = \begin{pmatrix}
m^1_1 & m^1_2 & \cdots & m^1_k \\
m^2_1 & m^2_2 & \cdots & m^2_k \\
\vdots & \vdots & & \vdots \\
m^r_1 & m^r_2 & \cdots & m^r_k
\end{pmatrix}. \]

The numbers $m^i_j$ are called entries. The superscript indexes the row of the
matrix and the subscript indexes the column of the matrix in which $m^i_j$
appears.


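An r × 1 matrix $v = (v^r_1) = (v^r)$ is called a column vector, written
\[ v = \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^r \end{pmatrix}. \]

A 1 × k matrix $v = (v^1_k) = (v_k)$ is called a row vector, written
\[ v = \begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix}. \]

The transpose of a column vector is the corresponding row vector and vice
versa:


Example 71 Let
\[ v = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}. \]
Then
\[ v^T = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}, \]
and $(v^T)^T = v$.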
A matrix is an efficient way to store information: 


Example 72 In computer graphics, you may have encountered image files with a .gif 
extension. These files are actually just matrices: at the start of the file the size of the 
matrix is given, after which each number is a matrix entry indicating the color of a 
particular pixel in the image. 

This matrix then has its rows shuffled a bit: by listing, say, every eighth row, a web 
browser downloading the file can start displaying an incomplete version of the picture 
before the download is complete. 

Finally, a compression algorithm is applied to the matrix to reduce the file size. 


Example 73 Graphs occur in many applications, ranging from telephone networks to 
airline routes. In the subject of graph theory, a graph is just a collection of vertices 
and some edges connecting vertices. A matrix can be used to indicate how many edges 
attach one vertex to another. 
For example, the graph pictured above would have the following matrix, where $m^i_j$
indicates the number of edges between the vertices labeled i and j:


eR NR 
oro w 
EP OrRFH 
wwe oore 


J 


This is an example of a symmetric matrix, since $m^i_j = m^j_i$.
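
To make the edge-counting idea concrete, here is a small sketch that builds an adjacency matrix from an edge list and checks that it is symmetric. The edge list is a made-up multigraph for illustration only; it is not the graph pictured in the text.

```python
# Adjacency matrix of a small undirected multigraph.
# The edge list below is a hypothetical example, not the graph from the text.
edges = [(0, 1), (0, 1), (1, 2), (2, 3), (3, 3)]   # (3, 3) is a loop
n = 4

A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] += 1
    if i != j:
        A[j][i] += 1      # an undirected edge is counted in both directions

# Entry A[i][j] counts edges between vertices i and j, so A equals its transpose.
is_symmetric = all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
```

Counting a loop once on the diagonal is a convention chosen for this sketch; some authors count it twice.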


An Adjacency Matrix Example 


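The set of all r × k matrices
\[ M^r_k = \{ (m^i_j) \mid m^i_j \in \mathbb{R};\ i \in \{1, \ldots, r\};\ j \in \{1, \ldots, k\} \} \]
is itself a vector space with addition and scalar multiplication defined as
follows:
\[ M + N = (m^i_j) + (n^i_j) = (m^i_j + n^i_j), \qquad \lambda M = \lambda (m^i_j) = (\lambda m^i_j). \]

In other words, addition just adds corresponding entries in two matrices,
and scalar multiplication multiplies every entry. Notice that $M^n_1 = \mathbb{R}^n$ is just
the vector space of column vectors.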
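Recall that we can multiply an r × k matrix by a k × 1 column vector to
produce an r × 1 column vector using the rule
\[ MV = \Bigl( \sum_{j=1}^{k} m^i_j v^j \Bigr). \]

This suggests the rule for multiplying an r × k matrix M by a k × s
matrix N: our k × s matrix N consists of s column vectors side-by-side, each
of dimension k × 1. We can multiply our r × k matrix M by each of these s
column vectors using the rule we already know, obtaining s column vectors
each of dimension r × 1. If we place these s column vectors side-by-side, we
obtain an r × s matrix MN.

That is, let
\[ N = \begin{pmatrix}
n^1_1 & n^1_2 & \cdots & n^1_s \\
n^2_1 & n^2_2 & \cdots & n^2_s \\
\vdots & \vdots & & \vdots \\
n^k_1 & n^k_2 & \cdots & n^k_s
\end{pmatrix} \]
and call the columns $N_1$ through $N_s$:
\[ N_1 = \begin{pmatrix} n^1_1 \\ n^2_1 \\ \vdots \\ n^k_1 \end{pmatrix}, \quad
   N_2 = \begin{pmatrix} n^1_2 \\ n^2_2 \\ \vdots \\ n^k_2 \end{pmatrix}, \quad \ldots, \quad
   N_s = \begin{pmatrix} n^1_s \\ n^2_s \\ \vdots \\ n^k_s \end{pmatrix}. \]
Then
\[ MN = M \begin{pmatrix} N_1 & N_2 & \cdots & N_s \end{pmatrix} = \begin{pmatrix} MN_1 & MN_2 & \cdots & MN_s \end{pmatrix}. \]

Concisely: If $M = (m^i_j)$ for $i = 1, \ldots, r$; $j = 1, \ldots, k$ and $N = (n^i_j)$ for
$i = 1, \ldots, k$; $j = 1, \ldots, s$, then MN = L where $L = (\ell^i_j)$ for $i = 1, \ldots, r$; $j = 1, \ldots, s$
is given by
\[ \ell^i_j = \sum_{p=1}^{k} m^i_p n^p_j. \]
This rule obeys linearity.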
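
The two descriptions of the product (column by column, and entry by entry) can be compared directly. The sketch below implements both with plain lists of rows; the function names and the sample matrices are illustrative assumptions.

```python
def mat_vec(M, v):
    """Multiply an r x k matrix by a k x 1 column, given as a flat list."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mat_mul_by_columns(M, N):
    """Form MN by multiplying M into each column of N and placing results side by side."""
    product_columns = [mat_vec(M, list(col)) for col in zip(*N)]
    return [list(row) for row in zip(*product_columns)]

def mat_mul_by_entries(M, N):
    """Form MN entry by entry: l^i_j = sum over p of m^i_p * n^p_j."""
    k = len(N)
    return [[sum(M[i][p] * N[p][j] for p in range(k)) for j in range(len(N[0]))]
            for i in range(len(M))]

M = [[1, 3],
     [3, 5],
     [2, 6]]          # a 3 x 2 matrix
N = [[2, 3, 1],
     [0, 1, 0]]       # a 2 x 3 matrix

by_columns = mat_mul_by_columns(M, N)
by_entries = mat_mul_by_entries(M, N)
```

Both functions return the same 3 × 3 matrix, as the "Concisely" formula predicts.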
Notice that in order for the multiplication to make sense, the columns
and rows must match. For an r × k matrix M and an s × m matrix N,
to make the product MN we must have k = s. Likewise, for the product
NM, it is required that m = r. A common shorthand for keeping track of
the sizes of the matrices involved in a given product is:
\[ (r \times k) \text{ times } (k \times m) \text{ is } (r \times m) \]


Reading homework: problem 1


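Example 74 Multiplying a (3 × 1) matrix and a (1 × 2) matrix yields a (3 × 2) matrix.
\[ \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix} \begin{pmatrix} 2 & 3 \end{pmatrix}
 = \begin{pmatrix} 1 \cdot 2 & 1 \cdot 3 \\ 3 \cdot 2 & 3 \cdot 3 \\ 2 \cdot 2 & 2 \cdot 3 \end{pmatrix}
 = \begin{pmatrix} 2 & 3 \\ 6 & 9 \\ 4 & 6 \end{pmatrix} \]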


Another way to view matrix multiplication is in terms of dot products: 


The entries of MN are made from the dot products of the rows of 
M with the columns of N. 


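Example 75 Let
\[ M = \begin{pmatrix} 1 & 3 \\ 3 & 5 \\ 2 & 6 \end{pmatrix} = \begin{pmatrix} u^T \\ v^T \\ w^T \end{pmatrix}
 \quad \text{and} \quad
 N = \begin{pmatrix} 2 & 3 & 1 \\ 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} a & b & c \end{pmatrix} \]
where
\[ u = \begin{pmatrix} 1 \\ 3 \end{pmatrix},\ v = \begin{pmatrix} 3 \\ 5 \end{pmatrix},\ w = \begin{pmatrix} 2 \\ 6 \end{pmatrix},\
   a = \begin{pmatrix} 2 \\ 0 \end{pmatrix},\ b = \begin{pmatrix} 3 \\ 1 \end{pmatrix},\ c = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
Then
\[ MN = \begin{pmatrix} u \cdot a & u \cdot b & u \cdot c \\ v \cdot a & v \cdot b & v \cdot c \\ w \cdot a & w \cdot b & w \cdot c \end{pmatrix}
 = \begin{pmatrix} 2 & 6 & 1 \\ 6 & 14 & 3 \\ 4 & 12 & 2 \end{pmatrix}. \]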


This fact has an obvious yet important consequence: 
Theorem 7.3.1. Let M be a matrix and x a column vector. If
\[ Mx = 0 \]
then the vector x is orthogonal to the rows of M.


Remark Remember that the set of all vectors that can be obtained by adding up 
scalar multiples of the columns of a matrix is called its column space . Similarly the 
row space is the set of all row vectors obtained by adding up multiples of the rows of 
a matrix. The above theorem says that if Mx = 0, then the vector x is orthogonal to 
every vector in the row space of M. 


We know that r × k matrices can be used to represent linear transformations
$\mathbb{R}^k \to \mathbb{R}^r$ via
\[ MV = \Bigl( \sum_{j=1}^{k} m^i_j v^j \Bigr), \]
which is the same rule used when we multiply an r × k matrix by a k × 1
vector to produce an r × 1 vector.

Likewise, we can use a matrix $N = (n^i_j)$ to define a linear transformation
of a vector space of matrices. For example
\[ L : M^s_k \xrightarrow{\ N\ } M^r_k, \qquad L(M) = (\ell^i_k) \ \text{where}\ \ell^i_k = \sum_{j=1}^{s} n^i_j m^j_k. \]
This is the same as the rule we use to multiply matrices. In other words,
L(M) = NM is a linear transformation.


Matrix Terminology Let $M = (m^i_j)$ be a matrix. The entries $m^i_i$ are called
diagonal, and the set $\{m^1_1, m^2_2, \ldots\}$ is called the diagonal of the matrix.

Any r × r matrix is called a square matrix. A square matrix that is zero
for all non-diagonal entries is called a diagonal matrix. An example of a
square diagonal matrix is


oo wb 
© U O 
> OS 
The r × r diagonal matrix with all diagonal entries equal to 1 is called
the identity matrix, $I_r$, or just I. An identity matrix looks like
\[ I = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix}. \]

The identity matrix is special because
\[ I_r M = M I_k = M \]
for all M of size r × k.


Definition The transpose of an r × k matrix $M = (m^i_j)$ is the k × r matrix
\[ M^T = (\hat{m}^i_j) \]
with entries $\hat{m}^i_j = m^j_i$.


A matrix M is symmetric if $M = M^T$.


Example 76 


and 


is symmetric. 


Reading homework: problem 2


Observations


• Only square matrices can be symmetric.
• The transpose of a column vector is a row vector, and vice-versa.
• Taking the transpose of a matrix twice does nothing, i.e., $(M^T)^T = M$.
Theorem 7.3.2 (Transpose and Multiplication). Let M, N be matrices such
that MN makes sense. Then
\[ (MN)^T = N^T M^T. \]


The proof of this theorem is left to Review Question 2. 


7.3.1 Associativity and Non-Commutativity 


Many properties of matrices follow from the same property for real numbers.
Here is an example.


Example 77 Associativity of matrix multiplication. We know for real numbers x, y
and z that
\[ x(yz) = (xy)z, \]
i.e., the order of bracketing does not matter. The same property holds for matrix
multiplication; let us show why. Suppose $M = (m^i_j)$, $N = (n^j_k)$ and $R = (r^k_l)$
are, respectively, m × n, n × r and r × t matrices. Then from the rule for matrix
multiplication we have
\[ MN = \Bigl( \sum_{j=1}^{n} m^i_j n^j_k \Bigr) \quad \text{and} \quad NR = \Bigl( \sum_{k=1}^{r} n^j_k r^k_l \Bigr). \]
So first we compute
\[ (MN)R = \Bigl( \sum_{k=1}^{r} \Bigl[ \sum_{j=1}^{n} m^i_j n^j_k \Bigr] r^k_l \Bigr)
 = \Bigl( \sum_{k=1}^{r} \sum_{j=1}^{n} \bigl[ m^i_j n^j_k \bigr] r^k_l \Bigr)
 = \Bigl( \sum_{k=1}^{r} \sum_{j=1}^{n} m^i_j n^j_k r^k_l \Bigr). \]

In the first step we just wrote out the definition for matrix multiplication, in the second
step we moved the summation symbol outside the bracket (this is just the distributive
property x(y + z) = xy + xz for numbers) and in the last step we used the associativity
property for real numbers to remove the square brackets. Exactly the same reasoning
shows that
\[ M(NR) = \Bigl( \sum_{j=1}^{n} m^i_j \Bigl[ \sum_{k=1}^{r} n^j_k r^k_l \Bigr] \Bigr)
 = \Bigl( \sum_{j=1}^{n} \sum_{k=1}^{r} m^i_j \bigl[ n^j_k r^k_l \bigr] \Bigr)
 = \Bigl( \sum_{j=1}^{n} \sum_{k=1}^{r} m^i_j n^j_k r^k_l \Bigr). \]

This is the same as above so we are done. As a fun remark, note that Einstein would
simply have written $(MN)R = (m^i_j n^j_k) r^k_l = m^i_j n^j_k r^k_l = m^i_j (n^j_k r^k_l) = M(NR)$.
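Sometimes matrices do not share the properties of regular numbers. In
particular, for generic n × n square matrices M and N,
\[ MN \neq NM. \]

Do Matrices Commute?


Example 78 (Matrix multiplication does not commute.)
\[ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \]
\[ \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \]

Since n × n matrices are linear transformations $\mathbb{R}^n \to \mathbb{R}^n$, we can see that
the order of successive linear transformations matters.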


Here is an example of matrices acting on objects in three dimensions that 
also shows matrices not commuting. 


On the other hand: 


Example 79 In Review Problem 3, you learned that the matrix
\[ M = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \]
rotates vectors in the plane by an angle θ. We can generalize this, using block matrices,
to three dimensions. In fact the following matrices built from a 2 × 2 rotation matrix,
a 1 × 1 identity matrix and zeroes everywhere else
\[ M = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
 \quad \text{and} \quad
 N = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}, \]
perform rotations by an angle θ in the xy and yz planes, respectively. Because they
rotate single vectors, you can also use them to rotate objects built from a collection of
vectors like pretty colored blocks! Here is a picture of M and then N acting on such
a block, compared with the case of N followed by M. The special case of θ = 90° is
shown.

Notice how the end products of MN and NM are different, so MN ≠ NM here.
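
The θ = 90° case can also be checked numerically. In the sketch below the two rotation matrices are written out with cos 90° = 0 and sin 90° = 1, so all entries are integers; the helper `mat_mul` is an illustrative name.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Rotations by 90 degrees in the xy plane (M) and the yz plane (N).
M = [[0, 1, 0],
     [-1, 0, 0],
     [0, 0, 1]]
N = [[1, 0, 0],
     [0, 0, 1],
     [0, -1, 0]]

MN = mat_mul(M, N)
NM = mat_mul(N, M)
commute = (MN == NM)
```

Here `commute` comes out false, matching the picture: rotating about one axis and then the other depends on the order.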


7.3.2 Block Matrices 


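It is often convenient to partition a matrix M into smaller matrices called
blocks, like so:
\[ M = \left( \begin{array}{ccc|c} 1 & 2 & 3 & 1 \\ 4 & 5 & 6 & 0 \\ 7 & 8 & 9 & 1 \\ \hline 0 & 1 & 2 & 0 \end{array} \right)
 = \left( \begin{array}{c|c} A & B \\ \hline C & D \end{array} \right) \]
Here
\[ A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix},\quad
   B = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix},\quad
   C = \begin{pmatrix} 0 & 1 & 2 \end{pmatrix},\quad
   D = (0). \]

• The blocks of a block matrix must fit together to form a rectangle. So
$\left( \begin{array}{c|c} B & A \\ \hline D & C \end{array} \right)$ makes sense, but
$\left( \begin{array}{c|c} C & B \\ \hline D & A \end{array} \right)$ does not.


Reading homework: problem 3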


• There are many ways to cut up an n × n matrix into blocks. Often
context or the entries of the matrix will suggest a useful way to divide
the matrix into blocks. For example, if there are large blocks of zeros
in a matrix, or blocks that look like an identity matrix, it can be useful
to partition the matrix accordingly.

• Matrix operations on block matrices can be carried out by treating the
blocks as matrix entries. In the example above,
\[ M^2 = \left( \begin{array}{c|c} A & B \\ \hline C & D \end{array} \right) \left( \begin{array}{c|c} A & B \\ \hline C & D \end{array} \right)
 = \left( \begin{array}{c|c} A^2 + BC & AB + BD \\ \hline CA + DC & CB + D^2 \end{array} \right) \]

Computing the individual blocks, we get:
\[ A^2 + BC = \begin{pmatrix} 30 & 37 & 44 \\ 66 & 81 & 96 \\ 102 & 127 & 152 \end{pmatrix}, \qquad
   AB + BD = \begin{pmatrix} 4 \\ 10 \\ 16 \end{pmatrix}, \]
\[ CA + DC = \begin{pmatrix} 18 & 21 & 24 \end{pmatrix}, \qquad
   CB + D^2 = (2). \]

Assembling these pieces into a block matrix gives:
\[ \left( \begin{array}{ccc|c} 30 & 37 & 44 & 4 \\ 66 & 81 & 96 & 10 \\ 102 & 127 & 152 & 16 \\ \hline 18 & 21 & 24 & 2 \end{array} \right) \]

This is exactly $M^2$.
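
As a sanity check on the block rule, the sketch below squares a 4 × 4 matrix both directly and block by block. It assumes M is the matrix with rows (1, 2, 3, 1), (4, 5, 6, 0), (7, 8, 9, 1), (0, 1, 2, 0), cut into a 3 × 3 block A, a 3 × 1 block B, a 1 × 3 block C and a 1 × 1 block D; the helper names are illustrative.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def mat_add(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

M = [[1, 2, 3, 1],
     [4, 5, 6, 0],
     [7, 8, 9, 1],
     [0, 1, 2, 0]]

# Cut M into blocks: A is 3x3, B is 3x1, C is 1x3, D is 1x1.
A = [row[:3] for row in M[:3]]
B = [row[3:] for row in M[:3]]
C = [M[3][:3]]
D = [M[3][3:]]

top_left = mat_add(mat_mul(A, A), mat_mul(B, C))      # A^2 + BC
top_right = mat_add(mat_mul(A, B), mat_mul(B, D))     # AB + BD
bottom_left = mat_add(mat_mul(C, A), mat_mul(D, C))   # CA + DC
bottom_right = mat_add(mat_mul(C, B), mat_mul(D, D))  # CB + D^2

# Glue the four blocks back together row by row.
assembled = ([l + r for l, r in zip(top_left, top_right)]
             + [l + r for l, r in zip(bottom_left, bottom_right)])
direct = mat_mul(M, M)
```

The block-by-block result `assembled` agrees with the direct product `direct`.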


7.3.3 The Algebra of Square Matrices 


Not every pair of matrices can be multiplied. When multiplying two matrices,
the number of columns in the left matrix must equal the number of rows in
the right. For an r × k matrix M and an s × l matrix N, we must
have k = s.
This is not a problem for square matrices of the same size, though.
Two n × n matrices can be multiplied in either order. For a single matrix
$M \in M^n_n$, we can form $M^2 = MM$, $M^3 = MMM$, and so on. It is
useful to define
\[ M^0 = I, \]
the identity matrix, just like $x^0 = 1$ for numbers.
As a result, any polynomial can be evaluated on a matrix.


Example 80 Let $f(x) = x - 2x^2 + 3x^3$ and


Then: 


Hence: 


Suppose f(x) is any function defined by a convergent Taylor Series:
\[ f(x) = f(0) + f'(0) x + \frac{1}{2!} f''(0) x^2 + \cdots \]
Then we can define the matrix function by just plugging in M:
\[ f(M) = f(0) + f'(0) M + \frac{1}{2!} f''(0) M^2 + \cdots \]

There are additional techniques to determine the convergence of Taylor Series
of matrices, based on the fact that the convergence problem is simple for
diagonal matrices. It also turns out that the matrix exponential
\[ \exp(M) = I + M + \frac{1}{2} M^2 + \frac{1}{3!} M^3 + \cdots \]
always converges.
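
A minimal sketch of the exponential series, truncated after a fixed number of terms, is given below. It uses exact fractions so that rounding does not obscure the result; the function name `exp_series`, the default of 20 terms and the two test matrices are assumptions made for illustration, and a truncated series is only an approximation in general.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def identity(n):
    """The n x n identity matrix with exact Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def exp_series(M, terms=20):
    """Partial sum I + M + M^2/2! + ... + M^(terms-1)/(terms-1)! of the series."""
    n = len(M)
    result = identity(n)
    power = identity(n)       # holds M^k
    factorial = 1             # holds k!
    for k in range(1, terms):
        power = mat_mul(power, M)
        factorial *= k
        result = [[result[i][j] + power[i][j] / factorial for j in range(n)]
                  for i in range(n)]
    return result

# For a matrix whose square is zero the series stops after two terms.
nilpotent = [[0, 1], [0, 0]]
exp_nilpotent = exp_series(nilpotent)

# For a diagonal matrix the series is just the scalar series on each diagonal entry.
diagonal = [[1, 0], [0, 0]]
exp_diagonal = exp_series(diagonal)
```

The diagonal case illustrates why convergence is simple for diagonal matrices: each diagonal entry is exponentiated on its own.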


An Matrix Exponential Example 
7.3.4 Trace


A large matrix contains a great deal of information, some of which often reflects
the fact that you have not set up your problem efficiently. For example,
a clever choice of basis can often make the matrix of a linear transformation
very simple. Therefore, finding ways to extract the essential information of
a matrix is useful. Here we need to assume that n < ∞, otherwise there are
subtleties with convergence that we'd have to address.


Definition The trace of a square matrix $M = (m^i_j)$ is the sum of its diagonal
entries:
\[ \operatorname{tr} M = \sum_{i=1}^{n} m^i_i. \]


Example 81
\[ \operatorname{tr} \begin{pmatrix} 2 & 7 & 6 \\ 9 & 5 & 1 \\ 4 & 3 & 8 \end{pmatrix} = 2 + 5 + 8 = 15. \]


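While matrix multiplication does not commute, the trace of a product of
matrices does not depend on the order of multiplication:
\[ \operatorname{tr}(MN) = \operatorname{tr}\Bigl( \sum_{l} M^i_l N^l_j \Bigr)
 = \sum_{i} \sum_{l} M^i_l N^l_i
 = \sum_{l} \sum_{i} N^l_i M^i_l
 = \operatorname{tr}\Bigl( \sum_{i} N^l_i M^i_l \Bigr)
 = \operatorname{tr}(NM). \]

Proof Explanation


Thus we have a Theorem:


Theorem 7.3.3.
\[ \operatorname{tr}(MN) = \operatorname{tr}(NM) \]
for any square matrices M and N.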
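
A quick numerical check of the theorem is sketched below for one pair of 2 × 2 matrices that do not commute. The helper names and the particular matrices are illustrative choices.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

M = [[1, 1],
     [0, 1]]
N = [[1, 0],
     [1, 1]]

MN = mat_mul(M, N)
NM = mat_mul(N, M)
```

The products `MN` and `NM` differ as matrices, yet `trace(MN)` and `trace(NM)` agree.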
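Example 82 Continuing from the previous example,
\[ M = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad N = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \]
so
\[ MN = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \neq NM = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}. \]

However, tr(MN) = 2 + 1 = 3 = 1 + 2 = tr(NM).


Another useful property of the trace is that:
\[ \operatorname{tr} M = \operatorname{tr} M^T \]

This is true because the trace only uses the diagonal entries, which are fixed
by the transpose. For example:
\[ \operatorname{tr} \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} = 4 = \operatorname{tr} \begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix} = \operatorname{tr} \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix}^T. \]
Finally, trace is a linear transformation from matrices to the real numbers.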


This is easy to check. 


7.4 Review Problems 


Webwork: | Reading Problems | 2<Qn, 3, 4 


1. Compute the following matrix products 


1 
12 1\ /-2 | -% 2 
4 5 2 2-2 |, (1 2 3 4 5)]3], 
78 2) \-1 2 -1 4 
5 
1 
2 1 2 1\ /-2 ¢ -3\ /121 
5 2 
3| (1 234 5), 452 2-3 §//4 5 2], 
4 78 2) \-1 2 -1) \7 8 2 
5 
22 2 12) fi: 2. 2 oS 
21 1\ [(« 0212 1//0121 2 
(ce y z){1 2 ijy] 0121 2/]021 2 1], 
11 2) \z 021.21) 1/0 1.2.1 2 
0000 2/ \00001 

-2 $ -% 4 2 -2\ f1 2 1 

2-2 3 6 2 -#]/]4 5 2 

-1 2 -1/ \12 -# Pj \7 8 2 


2. Let's prove the theorem $(MN)^T = N^T M^T$.


Note: the following is a common technique for proving matrix identities.


(a) Let $M = (m^i_j)$ and let $N = (n^i_j)$. Write out a few of the entries of
each matrix in the form given at the beginning of section 7.3.


(b) Multiply out MN and write out a few of its entries in the same
form as in part (a). In terms of the entries of M and the entries
of N, what is the entry in row i and column j of MN?


(c) Take the transpose $(MN)^T$ and write out a few of its entries in
the same form as in part (a). In terms of the entries of M and the
entries of N, what is the entry in row i and column j of $(MN)^T$?


(d) Take the transposes $N^T$ and $M^T$ and write out a few of their
entries in the same form as in part (a).


(e) Multiply out $N^T M^T$ and write out a few of its entries in the same
form as in part (a). In terms of the entries of M and the entries of
N, what is the entry in row i and column j of $N^T M^T$?


(f) Show that the answers you got in parts (c) and (e) are the same. 


1 20 


3. (a) Let a= (3 = 4 


| Find AAT and ATA and their traces. 


(b) Let M be any m × n matrix. Show that $M^T M$ and $M M^T$ are
symmetric. (Hint: use the result of the previous problem.) What
are their sizes? What is the relationship between their traces?
4. Let $x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$ and $y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$ be column vectors. Show that the
dot product $x \cdot y = x^T I y$.


Hint


5. Above, we showed that left multiplication by an r × s matrix N was
a linear transformation $M^s_k \xrightarrow{\ N\ } M^r_k$. Show that right multiplication
by a k × m matrix R is a linear transformation $M^s_k \xrightarrow{\ R\ } M^s_m$. In other
words, show that right matrix multiplication obeys linearity.


Hint


6. Let V be a vector space where $B = (v_1, v_2)$ is an ordered basis.
Suppose
\[ L : V \xrightarrow{\ \text{linear}\ } V \]
and
\[ L(v_1) = v_1 + v_2, \qquad L(v_2) = 2 v_1 + v_2. \]

Compute the matrix of L in the basis B and then compute the trace of
this matrix. Suppose that ad − bc ≠ 0 and consider now the new basis
\[ B' = (a v_1 + b v_2,\ c v_1 + d v_2). \]

Compute the matrix of L in the basis B'. Compute the trace of this
matrix. What do you find? What do you conclude about the trace
of a matrix? Does it make sense to talk about the “trace of a linear
transformation”?


7. Explain what happens to a matrix when:


(a) You multiply it on the left by a diagonal matrix?
(b) You multiply it on the right by a diagonal matrix?

Give a few simple examples before you start explaining.


8. Compute exp(A) for the following matrices: 


ZE 


9. Let M = [matrix display lost]. Divide M into named blocks,
with one block the 4 × 4 identity matrix, and then multiply blocks to
compute $M^2$.


Hint


10. A matrix A is called anti-symmetric (or skew-symmetric) if AT = —A. 
Show that for every n x n matrix M, we can write M = A + S where 
A is an anti-symmetric matrix and S is a symmetric matrix. 


Hint: What kind of matrix is M + MT? How about M — MT? 


11. An example of an operation which is not associative is the cross product.


(a) Give a simple example of three vectors from 3-space u, v, w such
that u × (v × w) ≠ (u × v) × w.

(b) We saw in chapter 1 that the operator B = u× (cross product
with a vector) is a linear operator. It can therefore be written as
a matrix (given an ordered basis such as the standard basis). How
is it that composing such linear operators is non-associative even
though matrix multiplication is associative?


7.5 Inverse Matrix 


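Definition A square matrix M is invertible (or nonsingular) if there exists
a matrix $M^{-1}$ such that
\[ M^{-1} M = I = M M^{-1}. \]
If M has no inverse, we say M is singular or non-invertible.

Inverse of a 2 × 2 Matrix Let M and N be the matrices:
\[ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad N = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \]

Multiplying these matrices gives:
\[ MN = \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix} = (ad - bc) I. \]
Then $M^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$, so long as ad − bc ≠ 0.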
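
The 2 × 2 formula translates directly into a few lines of code. The sketch below uses exact fractions and raises an error when ad − bc = 0; the function name `inverse_2x2` and the sample matrix are illustrative.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] via the formula (1/(ad - bc)) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: ad - bc = 0")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

M = [[1, 2],
     [3, 4]]
M_inv = inverse_2x2(M)

# Check M * M_inv entry by entry against the identity matrix.
product = [[sum(M[i][p] * M_inv[p][j] for p in range(2)) for j in range(2)]
           for i in range(2)]
```

For this sample matrix ad − bc = −2, so every entry of the swapped-and-negated matrix is divided by −2.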


7.5.1 Three Properties of the Inverse 


1. If A is a square matrix and B is the inverse of A, then A is the inverse
of B, since AB = I = BA. So we have the identity:
\[ (A^{-1})^{-1} = A \]

2. Notice that $B^{-1} A^{-1} A B = B^{-1} I B = I = A B B^{-1} A^{-1}$. Then:
\[ (AB)^{-1} = B^{-1} A^{-1} \]
Thus, much like the transpose, taking the inverse of a product reverses
the order of the product.

3. Finally, recall that $(AB)^T = B^T A^T$. Since $I^T = I$, then $(A^{-1} A)^T = A^T (A^{-1})^T = I$.
Similarly, $(A A^{-1})^T = (A^{-1})^T A^T = I$. Then:
\[ (A^{-1})^T = (A^T)^{-1} \]


Figure 7.1: The formula for the inverse of a 2×2 matrix is worth memorizing!


A 2 × 2 Example


7.5.2 Finding Inverses (Redux) 


Gaussian elimination can be used to find inverse matrices. This concept is
covered in chapter 2, section 2.3.2, but is presented here again as review.
Suppose M is a square invertible matrix and MX = V is a linear system.
The solution must be unique because it can be found by multiplying the
equation on both sides by $M^{-1}$ yielding $X = M^{-1} V$. Thus, the reduced row
echelon form of the linear system has an identity matrix on the left:
\[ (M \mid V) \sim (I \mid M^{-1} V) \]
Solving the linear system MX = V then tells us what $M^{-1} V$ is.

To solve many linear systems with the same matrix at once,
\[ MX = V_1, \quad MX = V_2 \]
we can consider augmented matrices with many columns on the right and
then apply Gaussian row reduction to the left side of the matrix. Once the
identity matrix is on the left side of the augmented matrix, then the solution
of each of the individual linear systems is on the right.
\[ (M \mid V_1\ V_2) \sim (I \mid M^{-1} V_1\ M^{-1} V_2) \]

To compute $M^{-1}$, we would like $M^{-1}$, rather than $M^{-1} V$, to appear on
the right side of our augmented matrix. This is achieved by solving the
collection of systems $MX = e_k$, where $e_k$ is the column vector of zeroes with
a 1 in the kth entry. I.e., the n × n identity matrix can be viewed as a bunch
of column vectors $I_n = (e_1\ e_2\ \cdots\ e_n)$. So, putting the $e_k$'s together into an
identity matrix, we get:
\[ (M \mid I) \sim (I \mid M^{-1} I) = (I \mid M^{-1}) \]

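Example 83 Find $\begin{pmatrix} -1 & 2 & -3 \\ 2 & 1 & 0 \\ 4 & -2 & 5 \end{pmatrix}^{-1}$.

We start by writing the augmented matrix, then apply row reduction to the left side.
\[ \left( \begin{array}{ccc|ccc} -1 & 2 & -3 & 1 & 0 & 0 \\ 2 & 1 & 0 & 0 & 1 & 0 \\ 4 & -2 & 5 & 0 & 0 & 1 \end{array} \right)
 \sim \left( \begin{array}{ccc|ccc} 1 & -2 & 3 & -1 & 0 & 0 \\ 0 & 5 & -6 & 2 & 1 & 0 \\ 0 & 6 & -7 & 4 & 0 & 1 \end{array} \right) \]
\[ \sim \left( \begin{array}{ccc|ccc} 1 & 0 & \frac{3}{5} & -\frac{1}{5} & \frac{2}{5} & 0 \\ 0 & 1 & -\frac{6}{5} & \frac{2}{5} & \frac{1}{5} & 0 \\ 0 & 0 & \frac{1}{5} & \frac{8}{5} & -\frac{6}{5} & 1 \end{array} \right)
 \sim \left( \begin{array}{ccc|ccc} 1 & 0 & 0 & -5 & 4 & -3 \\ 0 & 1 & 0 & 10 & -7 & 6 \\ 0 & 0 & 1 & 8 & -6 & 5 \end{array} \right) \]

At this point, we know $M^{-1}$ assuming we didn't goof up. However, row reduction
is a lengthy and arithmetically involved process, so we should check our answer, by
confirming that $M M^{-1} = I$ (or if you prefer $M^{-1} M = I$):
\[ M M^{-1} = \begin{pmatrix} -1 & 2 & -3 \\ 2 & 1 & 0 \\ 4 & -2 & 5 \end{pmatrix} \begin{pmatrix} -5 & 4 & -3 \\ 10 & -7 & 6 \\ 8 & -6 & 5 \end{pmatrix}
 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

The product of the two matrices is indeed the identity matrix, so we're done.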
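
The row reduction of (M | I) can be sketched as a short Gauss-Jordan routine. The version below works in exact fractions and swaps rows when a pivot is zero; the function name `invert` is an illustrative choice, and the row operations it performs need not match, step for step, the ones used in the worked example.

```python
from fractions import Fraction

def invert(M):
    """Invert a square matrix by row reducing the augmented matrix (M | I)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot equals 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The left half is now I, so the right half is the inverse.
    return [row[n:] for row in aug]

M = [[-1, 2, -3],
     [2, 1, 0],
     [4, -2, 5]]
M_inv = invert(M)
```

Because the arithmetic is exact, the result can be compared entry by entry with the inverse found by hand.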


"a Reading homework: problem 4 


7.5.3 Linear Systems and Inverses 


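If $M^{-1}$ exists and is known, then we can immediately solve linear systems
associated to M.


Example 84 Consider the linear system:
\[ \begin{array}{rcl} -x + 2y - 3z & = & 1 \\ 2x + y & = & 2 \\ 4x - 2y + 5z & = & 0 \end{array} \]
The associated matrix equation is $MX = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}$, where M is the same as in the
previous section. Then:
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix}
 = \begin{pmatrix} -1 & 2 & -3 \\ 2 & 1 & 0 \\ 4 & -2 & 5 \end{pmatrix}^{-1} \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}
 = \begin{pmatrix} -5 & 4 & -3 \\ 10 & -7 & 6 \\ 8 & -6 & 5 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}
 = \begin{pmatrix} 3 \\ -4 \\ -4 \end{pmatrix} \]
Then $\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 3 \\ -4 \\ -4 \end{pmatrix}$. In summary, when $M^{-1}$ exists, then
\[ MX = V \Leftrightarrow X = M^{-1} V. \]

Reading homework: problem 5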




7.5.4 Homogeneous Systems 


Theorem 7.5.1. A square matrix M is invertible if and only if the homogeneous
system
\[ MX = 0 \]
has no non-zero solutions.


Proof. First, suppose that $M^{-1}$ exists. Then $MX = 0 \Rightarrow X = M^{-1} 0 = 0$.
Thus, if M is invertible, then MX = 0 has no non-zero solutions.

On the other hand, MX = 0 always has the solution X = 0. If no other
solutions exist, then M can be put into reduced row echelon form with every
variable a pivot. In this case, $M^{-1}$ can be computed using the process in the
previous section.

7.5.5 Bit Matrices 


In computer science, information is recorded using binary strings of data. 
For example, the following string contains an English word: 


011011000110100101101110011001010110000101110010 


A bit is the basic unit of information, keeping track of a single one or zero. 
Computers can add and multiply individual bits very quickly. 
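
As a small illustration of reading such a string, the sketch below splits the 48 bits above into six groups of eight and interprets each group as an ASCII character code. The variable names are illustrative, and the eight-bits-per-character reading is the ASCII convention described later in this section.

```python
bits = "011011000110100101101110011001010110000101110010"

# Split the string into consecutive groups of eight bits.
chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]

# Read each group as a binary number and look up the matching ASCII character.
word = "".join(chr(int(chunk, 2)) for chunk in chunks)
```

Printing `word` reveals the English word hidden in the string.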

In chapter 5, section 5.2 it is explained how to formulate vector spaces
over fields other than real numbers. In particular, all of the properties of a
vector space make sense with numbers $Z_2 = \{0, 1\}$ with addition and
multiplication given by:

    + | 0  1          × | 0  1
    0 | 0  1          0 | 0  0
    1 | 1  0          1 | 0  1

Notice that −1 = 1, since 1 + 1 = 0. Therefore, we can apply all of the linear
algebra we have learned thus far to matrices with $Z_2$ entries. A matrix with
entries in $Z_2$ is sometimes called a bit matrix.

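Example 85 $\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$ is an invertible matrix over $Z_2$:
\[ \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} \]
This can be easily verified by multiplying:
\[ \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]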
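
The same verification can be done in code by reducing every sum modulo 2. The sketch below checks the product of the two bit matrices and then uses them to scramble and recover a 3-bit column; the helper name `mat_mul_mod2` and the sample message are illustrative.

```python
def mat_mul_mod2(X, Y):
    """Matrix product with all arithmetic carried out in Z_2 (i.e. modulo 2)."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*Y)] for row in X]

M = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 1]]
M_inv = [[0, 1, 1],
         [1, 0, 1],
         [1, 1, 1]]

product = mat_mul_mod2(M, M_inv)

# Scramble a 3-bit column vector with M, then recover it with the inverse.
message = [[1], [0], [0]]
encoded = mat_mul_mod2(M, message)
decoded = mat_mul_mod2(M_inv, encoded)
```

The round trip `decoded == message` is the small-scale version of the cipher idea in the cryptography application.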


Application: Cryptography A very simple way to hide information is to use a
substitution cipher, in which the alphabet is permuted and each letter in a message is
systematically exchanged for another. For example, the ROT-13 cipher just exchanges
a letter with the letter thirteen places before or after it in the alphabet. For example,
HELLO becomes URYYB. Applying the algorithm again decodes the message, turning
URYYB back into HELLO. Substitution ciphers are easy to break, but the basic idea
can be extended to create cryptographic systems that are practically uncrackable. For
example, a one-time pad is a system that uses a different substitution for each letter
in the message. So long as a particular set of substitutions is not used on more than
one message, the one-time pad is unbreakable.

English characters are often stored in computers in the ASCII format. In ASCII,
a single character is represented by a string of eight bits, which we can consider as a
vector in $Z_2^8$ (which is like vectors in $\mathbb{R}^8$, where the entries are zeros and ones). One
way to create a substitution cipher, then, is to choose an 8 × 8 invertible bit matrix
M, and multiply each letter of the message by M. Then to decode the message, each
string of eight bits would be multiplied by $M^{-1}$.

To make the message a bit tougher to decode, one could consider pairs (or longer
sequences) of letters as a single vector in $Z_2^{16}$ (or a higher-dimensional space), and
then use an appropriately-sized invertible matrix. For more on cryptography, see “The
Code Book,” by Simon Singh (1999, Doubleday).


7.6 Review Problems 


Webwork: | Reading Problems | 6<21, 7 




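1. Find formulas for the inverses of the following matrices, when they are
not singular:

(a) $\begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix}$

(b) $\begin{pmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{pmatrix}$

When are these matrices singular?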


2. Write down all 2 × 2 bit matrices and decide which of them are singular.
For those which are not singular, pair them with their inverse.


3. Let M be a square matrix. Explain why the following statements are
equivalent:


(a) MX =V has a unique solution for every column vector V. 
(b) M is non-singular. 
Hint: In general for problems like this, think about the key words: 


First, suppose that there is some column vector V such that the equa- 
tion MX = V has two distinct solutions. Show that M must be sin- 
gular; that is, show that M can have no inverse. 


Next, suppose that there is some column vector V such that the equa- 
tion MX = V has no solutions. Show that M must be singular. 


Finally, suppose that M is non-singular. Show that no matter what 
the column vector V is, there is a unique solution to MX = V. 


An Hint 


4. Left and Right Inverses: So far we have only talked about inverses of
square matrices. This problem will explore the notion of a left and
right inverse for a matrix that is not square. Let
\[ A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix} \]

(a) Compute:
i. $A A^T$,
ii. $(A A^T)^{-1}$,
iii. $B := A^T (A A^T)^{-1}$

(b) Show that the matrix B above is a right inverse for A, i.e., verify
that
\[ AB = I. \]

(c) Does BA make sense? (Why not?)


(d) Let A be an n × m matrix with n > m. Suggest a formula for a
left inverse C such that
\[ CA = I \]
Hint: you may assume that $A^T A$ has an inverse.


(e) Test your proposal for a left inverse for the simple example 


(f) True or false: Left and right inverses are unique. If false give a 
counterexample. 


An Hint 


5. Show that if the range (remember that the range of a function is the
set of all its possible outputs) of a 3 × 3 matrix M (viewed as a function
$\mathbb{R}^3 \to \mathbb{R}^3$) is a plane then one of the columns is a sum of multiples of the
other columns. Show that this relationship is preserved under EROs.
Show, further, that the solutions to Mx = 0 describe this relationship
between the columns.


6. If M and N are square matrices of the same size such that $M^{-1}$ exists
and $N^{-1}$ does not exist, does $(MN)^{-1}$ exist?


7. If M is a square matrix which is not invertible, is exp M invertible? 
8. Elementary Column Operations (ECOs) can be defined in the same 3
types as EROs. Describe the 3 kinds of ECOs. Show that if maximal 
elimination using ECOs is performed on a square matrix and a column 
of zeros is obtained then that matrix is not invertible. 
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7.7 LU Redux 


Certain matrices are easier to work with than others. In this section, we will see how to write any square¹ matrix M as the product of two simpler matrices. We will write

$$M = LU,$$

where:

• L is lower triangular. This means that all entries above the main diagonal are zero. In notation, $L = (l^i_j)$ with $l^i_j = 0$ for all $j > i$.

• U is upper triangular. This means that all entries below the main diagonal are zero. In notation, $U = (u^i_j)$ with $u^i_j = 0$ for all $j < i$.

$$U = \begin{pmatrix} u^1_1 & u^1_2 & u^1_3 & \cdots \\ 0 & u^2_2 & u^2_3 & \cdots \\ 0 & 0 & u^3_3 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

M = LU is called an LU decomposition of M.

This is a useful trick for computational reasons; it is much easier to com- 
pute the inverse of an upper or lower triangular matrix than general matrices. 
Since inverses are useful for solving linear systems, this makes solving any lin- 
ear system associated to the matrix much faster as well. The determinant—a 
very important quantity associated with any square matrix—is very easy to 
compute for triangular matrices. 


Example 86 Linear systems associated to upper triangular matrices are very easy to solve by back substitution.

$$\left(\begin{array}{cc|c} a & b & 1 \\ 0 & c & e \end{array}\right) \Rightarrow y = \frac{e}{c}, \quad x = \frac{1}{a}\left(1 - \frac{be}{c}\right)$$

$$\left(\begin{array}{ccc|c} 1 & 0 & 0 & d \\ a & 1 & 0 & e \\ b & c & 1 & f \end{array}\right) \Rightarrow x = d, \quad y = e - ad, \quad z = f - bd - c(e - ad)$$

For lower triangular matrices, forward substitution gives a quick solution; for upper triangular matrices, back substitution gives the solution.


¹The case where M is not square is dealt with at the end of the section.


7.7.1 Using LU Decomposition to Solve Linear Systems 
Suppose we have M = LU and want to solve the system 


MX = LUX =V. 


• Step 1: Set $W = \begin{pmatrix} u \\ v \\ w \end{pmatrix} = UX$.


• Step 2: Solve the system $LW = V$. This should be simple by forward substitution since L is lower triangular. Suppose the solution to $LW = V$ is $W_0$.


• Step 3: Now solve the system $UX = W_0$. This should be easy by backward substitution, since U is upper triangular. The solution to this system is the solution to the original system.


We can think of this as using the matrix L to perform row operations on the 
matrix U in order to solve the system; this idea also appears in the study of 
determinants. 


Reading homework: problem 6


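Example 87 Consider the linear system:

$$\begin{aligned} 6x + 18y + 3z &= 3 \\ 2x + 12y + z &= 19 \\ 4x + 15y + 3z &= 0 \end{aligned}$$

An LU decomposition for the associated matrix M is:

$$\begin{pmatrix} 6 & 18 & 3 \\ 2 & 12 & 1 \\ 4 & 15 & 3 \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 \\ 1 & 6 & 0 \\ 2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 2 & 6 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

• Step 1: Set $W = \begin{pmatrix} u \\ v \\ w \end{pmatrix} = UX$.

• Step 2: Solve the system $LW = V$:

$$\left(\begin{array}{ccc|c} 3 & 0 & 0 & 3 \\ 1 & 6 & 0 & 19 \\ 2 & 3 & 1 & 0 \end{array}\right) \Rightarrow u = 1, \quad v = 3, \quad w = -11, \quad\text{so}\quad W_0 = \begin{pmatrix} 1 \\ 3 \\ -11 \end{pmatrix}.$$

• Step 3: Solve the system $UX = W_0$:

$$\left(\begin{array}{ccc|c} 2 & 6 & 1 & 1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & -11 \end{array}\right)$$

Back substitution gives $z = -11$, $y = 3$, and $x = -3$.

Then $X = \begin{pmatrix} -3 \\ 3 \\ -11 \end{pmatrix}$ and we're done.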
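The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `forward_substitution`, `back_substitution` and `lu_solve` are made up for this example, matrices are stored as lists of rows, and the diagonal entries of L and U are assumed to be non-zero. Exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def forward_substitution(L, V):
    """Solve L W = V for W, assuming L is lower triangular with non-zero diagonal."""
    n = len(L)
    W = [Fraction(0)] * n
    for i in range(n):
        # Entries W[0..i-1] are already known; move them to the right-hand side.
        known = sum(L[i][k] * W[k] for k in range(i))
        W[i] = (Fraction(V[i]) - known) / L[i][i]
    return W

def back_substitution(U, W):
    """Solve U X = W for X, assuming U is upper triangular with non-zero diagonal."""
    n = len(U)
    X = [Fraction(0)] * n
    for i in reversed(range(n)):
        # Entries X[i+1..n-1] are already known; move them to the right-hand side.
        known = sum(U[i][k] * X[k] for k in range(i + 1, n))
        X[i] = (Fraction(W[i]) - known) / U[i][i]
    return X

def lu_solve(L, U, V):
    """Solve L U X = V: first L W = V (Step 2), then U X = W (Step 3)."""
    W0 = forward_substitution(L, V)
    return back_substitution(U, W0)

# The factorization and right-hand side of the example above.
L = [[3, 0, 0], [1, 6, 0], [2, 3, 1]]
U = [[2, 6, 1], [0, 1, 0], [0, 0, 1]]
V = [3, 19, 0]
X = lu_solve(L, U, V)
print([int(x) for x in X])  # prints [-3, 3, -11]
```

Each substitution pass touches every entry of a triangular matrix once, which is why solving with a known LU decomposition is cheap compared to eliminating from scratch for every new V.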


Using an LU decomposition


[Figure: the LU method for solving $MX = V$. (i) Write $M = LU$, so $LUX = V$, and set $W = UX$. (ii) Solve $LW = V$. (iii) Solve $UX = W$.]
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7.7.2 Finding an LU Decomposition. 


In chapter 2, section 2.3.4, Gaussian elimination was used to find LU matrix 
decompositions. These ideas are presented here again as review. 

For any given matrix, there are actually many different LU decompositions. However, there is a unique LU decomposition in which the L matrix has ones on the diagonal. In that case L is called a lower unit triangular matrix.

To find the LU decomposition, we'll create two sequences of matrices $L_1, L_2, \ldots$ and $U_1, U_2, \ldots$ such that at each step, $L_i U_i = M$. Each of the $L_i$ will be lower triangular, but only the last $U_i$ will be upper triangular. The main trick for this calculation is captured by the following example:


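Example 88 (An Elementary Matrix)


Consider
$$E = \begin{pmatrix} 1 & 0 \\ \lambda & 1 \end{pmatrix}, \qquad M = \begin{pmatrix} a & b & c & \cdots \\ d & e & f & \cdots \end{pmatrix}.$$

Let's compute EM:
$$EM = \begin{pmatrix} a & b & c & \cdots \\ d + \lambda a & e + \lambda b & f + \lambda c & \cdots \end{pmatrix}.$$

Something neat happened here: multiplying M by E performed the row operation $R_2 \to R_2 + \lambda R_1$ on M. Another interesting fact:
$$E^{-1} := \begin{pmatrix} 1 & 0 \\ -\lambda & 1 \end{pmatrix}$$
obeys (check this yourself...)
$$E^{-1} E = I.$$
Hence $M = E^{-1} E M$ or, writing this out,
$$\begin{pmatrix} a & b & c & \cdots \\ d & e & f & \cdots \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\lambda & 1 \end{pmatrix} \begin{pmatrix} a & b & c & \cdots \\ d + \lambda a & e + \lambda b & f + \lambda c & \cdots \end{pmatrix}.$$
Here the matrix on the left is lower triangular, while the matrix on the right has had a row operation performed on it.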


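We would like to use the first row of M to zero out the first entry of every row below it. For our running example,

$$M = \begin{pmatrix} 6 & 18 & 3 \\ 2 & 12 & 1 \\ 4 & 15 & 3 \end{pmatrix},$$

so we would like to perform the row operations

$$R_2 \to R_2 - \frac{1}{3} R_1 \quad\text{and}\quad R_3 \to R_3 - \frac{2}{3} R_1.$$

If we perform these row operations on M to produce

$$U_1 = \begin{pmatrix} 6 & 18 & 3 \\ 0 & 6 & 0 \\ 0 & 3 & 1 \end{pmatrix},$$

we need to multiply this on the left by a lower triangular matrix $L_1$ so that the product $L_1 U_1 = M$ still. The above example shows how to do this: Set $L_1$ to be the lower triangular matrix whose first column is filled with minus the constants used to zero out the first column of M. Then

$$L_1 = \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0 \\ \frac{2}{3} & 0 & 1 \end{pmatrix}.$$

By construction $L_1 U_1 = M$, but you should compute this yourself as a double check.

Now repeat the process by zeroing the second column of $U_1$ below the diagonal using the second row of $U_1$, via the row operation $R_3 \to R_3 - \frac{1}{2} R_2$, to produce

$$U_2 = \begin{pmatrix} 6 & 18 & 3 \\ 0 & 6 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

The matrix that undoes this row operation is obtained in the same way we found $L_1$ above and is:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \frac{1}{2} & 1 \end{pmatrix}.$$

Thus our answer for $L_2$ is the product of this matrix with $L_1$, namely

$$L_2 = \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0 \\ \frac{2}{3} & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \frac{1}{2} & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0 \\ \frac{2}{3} & \frac{1}{2} & 1 \end{pmatrix}.$$

Notice that it is lower triangular because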
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The product of lower triangular matrices is always lower triangular! 


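Moreover it is obtained by recording minus the constants used for all our row operations in the appropriate columns (this always works this way). Moreover, $U_2$ is upper triangular and $M = L_2 U_2$, so we are done! Putting this all together we have

$$M = \begin{pmatrix} 6 & 18 & 3 \\ 2 & 12 & 1 \\ 4 & 15 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0 \\ \frac{2}{3} & \frac{1}{2} & 1 \end{pmatrix} \begin{pmatrix} 6 & 18 & 3 \\ 0 & 6 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$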


If the matrix you’re working with has more than three rows, just continue 
this process by zeroing out the next column below the diagonal, and repeat 
until there’s nothing left to do. 
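The procedure just described can be sketched as a short Python function. This is a sketch under stated assumptions, not a library routine: the name `lu_decompose` is made up here, matrices are lists of rows, exact fractions are used, and no row exchanges are performed, so the function simply gives up if it meets a zero pivot with non-zero entries below it.

```python
from fractions import Fraction

def lu_decompose(M):
    """Return (L, U) with L lower unit triangular and L U = M, using only
    row operations of the form R_row -> R_row - c * R_col (no row swaps)."""
    m, n = len(M), len(M[0])
    U = [[Fraction(x) for x in row] for row in M]                        # U_0 = M
    L = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]    # L_0 = I
    for col in range(min(m, n)):
        pivot = U[col][col]
        for row in range(col + 1, m):
            if U[row][col] == 0:
                continue  # nothing to cancel in this row
            if pivot == 0:
                raise ValueError("zero pivot: a row exchange would be needed")
            c = U[row][col] / pivot
            # Row operation R_row -> R_row - c * R_col zeroes the entry below the pivot.
            U[row] = [a - c * b for a, b in zip(U[row], U[col])]
            # Record minus the constant used (-c), i.e. c, in the matching slot of L.
            L[row][col] = c
    return L, U

def matmul(A, B):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

M = [[6, 18, 3], [2, 12, 1], [4, 15, 3]]
L, U = lu_decompose(M)
print(L)  # rows (1, 0, 0), (1/3, 1, 0), (2/3, 1/2, 1) as Fractions
print(U)  # rows (6, 18, 3), (0, 6, 0), (0, 0, 1) as Fractions
```

Multiplying the two factors back together with `matmul(L, U)` recovers M, which is the double check suggested in the text.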


@ Another LU decomposition example 


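The fractions in the L matrix are admittedly ugly. For two matrices LU, we can multiply one entire column of L by a constant λ and divide the corresponding row of U by the same constant without changing the product of the two matrices. Then:

$$\begin{aligned} LU &= \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0 \\ \frac{2}{3} & \frac{1}{2} & 1 \end{pmatrix} I \begin{pmatrix} 6 & 18 & 3 \\ 0 & 6 & 0 \\ 0 & 0 & 1 \end{pmatrix} \\ &= \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0 \\ \frac{2}{3} & \frac{1}{2} & 1 \end{pmatrix} \begin{pmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{6} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 6 & 18 & 3 \\ 0 & 6 & 0 \\ 0 & 0 & 1 \end{pmatrix} \\ &= \begin{pmatrix} 3 & 0 & 0 \\ 1 & 6 & 0 \\ 2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 2 & 6 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \end{aligned}$$

The resulting matrix looks nicer, but isn't in standard (lower unit triangular matrix) form.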
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ON Reading homework: problem 7 


For matrices that are not square, LU decomposition still makes sense. Given an $m \times n$ matrix M, for example we could write $M = LU$ with L a square lower unit triangular matrix, and U a rectangular matrix. Then L will be an $m \times m$ matrix, and U will be an $m \times n$ matrix (of the same shape as M). From here, the process is exactly the same as for a square matrix. We create a sequence of matrices $L_i$ and $U_i$ that is eventually the LU decomposition. Again, we start with $L_0 = I$ and $U_0 = M$.


Example 89 Let's find the LU decomposition of $M = U_0 = \begin{pmatrix} -2 & 1 & 3 \\ -4 & 4 & 1 \end{pmatrix}$. Since M is a $2 \times 3$ matrix, our decomposition will consist of a $2 \times 2$ matrix and a $2 \times 3$ matrix. Then we start with $L_0 = I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.

The next step is to zero-out the first column of M below the diagonal. There is only one row to cancel, then, and it can be removed by subtracting 2 times the first row of M from the second row of M. Then:

$$L_1 = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}, \qquad U_1 = \begin{pmatrix} -2 & 1 & 3 \\ 0 & 2 & -5 \end{pmatrix}.$$

Since $U_1$ is upper triangular, we're done. With a larger matrix, we would just continue the process.


7.7.3 Block LDU Decomposition 


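Let M be a square block matrix with square blocks X, Y, Z, W such that $X^{-1}$ exists. Then M can be decomposed as a block LDU decomposition, where D is block diagonal, as follows:

$$M = \begin{pmatrix} X & Y \\ Z & W \end{pmatrix}.$$

Then:

$$M = \begin{pmatrix} I & 0 \\ Z X^{-1} & I \end{pmatrix} \begin{pmatrix} X & 0 \\ 0 & W - Z X^{-1} Y \end{pmatrix} \begin{pmatrix} I & X^{-1} Y \\ 0 & I \end{pmatrix}.$$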
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This can be checked explicitly simply by block-multiplying these three ma- 
trices. 
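For readers who want to see the check written out, here is the block multiplication carried out in two stages (first the left two factors, then the result times the right factor). Only the rules of block multiplication and $X X^{-1} = I$ are used.

```latex
\begin{align*}
\begin{pmatrix} I & 0 \\ Z X^{-1} & I \end{pmatrix}
\begin{pmatrix} X & 0 \\ 0 & W - Z X^{-1} Y \end{pmatrix}
&= \begin{pmatrix} X & 0 \\ Z X^{-1} X & W - Z X^{-1} Y \end{pmatrix}
 = \begin{pmatrix} X & 0 \\ Z & W - Z X^{-1} Y \end{pmatrix}, \\[4pt]
\begin{pmatrix} X & 0 \\ Z & W - Z X^{-1} Y \end{pmatrix}
\begin{pmatrix} I & X^{-1} Y \\ 0 & I \end{pmatrix}
&= \begin{pmatrix} X & X X^{-1} Y \\ Z & Z X^{-1} Y + W - Z X^{-1} Y \end{pmatrix}
 = \begin{pmatrix} X & Y \\ Z & W \end{pmatrix}.
\end{align*}
```

Note that the order of the factors in each block product matters, since the blocks are matrices and need not commute.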


rN Block LDU Explanation 


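Example 90 For a $2 \times 2$ matrix, we can regard each entry as a $1 \times 1$ block.

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -2 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$$

By multiplying the diagonal matrix by the upper triangular matrix, we get the standard LU decomposition of the matrix.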


You are now ready to attempt the first sample midterm. 


7.8 Review Problems 


Reading Problems: 7, 8
Webwork: LU Decomposition 14


1. Consider the linear system:

$$\begin{aligned} x^1 &= v^1 \\ x^1 + x^2 &= v^2 \\ &\;\;\vdots \\ x^1 + x^2 + \cdots + x^n &= v^n \end{aligned}$$

i. Find $x^1$.
ii. Find $x^2$.
iii. Find $x^3$.
k. Try to find a formula for $x^k$. Don't worry about simplifying your answer.


2. Let $M = \begin{pmatrix} X & Y \\ Z & W \end{pmatrix}$ be a square $n \times n$ block matrix with W invertible.

i. If W has r rows, what size are X, Y, and Z?

ii. Find a UDL decomposition for M. In other words, fill in the stars in the following equation:

$$\begin{pmatrix} X & Y \\ Z & W \end{pmatrix} = \begin{pmatrix} I & * \\ 0 & I \end{pmatrix} \begin{pmatrix} * & 0 \\ 0 & * \end{pmatrix} \begin{pmatrix} I & 0 \\ * & I \end{pmatrix}$$

3. Show that if M is a square matrix which is not invertible then either the matrix U or the matrix L in the LU-decomposition $M = LU$ has a zero on its diagonal.


4. Describe what upper and lower triangular matrices do to the unit hy- 
percube in their domain. 


5. In chapter 3 we saw that since, in general, row exchange matrices are 
necessary to achieve upper triangular form, LDPU factorization is the 
complete decomposition of an invertible matrix into EROs of various 
kinds. Suggest a procedure for using LDPU decompositions to solve 
linear systems that generalizes the procedure above. 


6. Is there a reason to prefer LU decomposition to UL decomposition, or 
is the order just a convention? 


7. If M is invertible then what are the LU, LDU, and LDPU decompositions of $M^{-1}$ in terms of the decompositions for M?


8. Argue that if M is symmetric then $L = U^T$ in the LDU decomposition of M.
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Determinants 


Given a square matrix, is there an easy way to know when it is invertible? 
Answering this fundamental question is the goal of this chapter. 


8.1 The Determinant Formula 


The determinant extracts a single number from a matrix that determines whether the matrix is invertible. Let's see how this works for small matrices first.


8.1.1 Simple Examples


For small cases, we already know when a matrix is invertible. If M is a $1 \times 1$ matrix, then $M = (m) \Rightarrow M^{-1} = (1/m)$. Then M is invertible if and only if $m \neq 0$.


For M a $2 \times 2$ matrix, chapter 7 section 7.5 shows that if

$$M = \begin{pmatrix} m^1_1 & m^1_2 \\ m^2_1 & m^2_2 \end{pmatrix},$$

then

$$M^{-1} = \frac{1}{m^1_1 m^2_2 - m^1_2 m^2_1} \begin{pmatrix} m^2_2 & -m^1_2 \\ -m^2_1 & m^1_1 \end{pmatrix}.$$

Thus M is invertible if and only if

$$m^1_1 m^2_2 - m^1_2 m^2_1 \neq 0.$$

Figure 8.1: Memorize the determinant formula for a 2x2 matrix!

For $2 \times 2$ matrices, this quantity is called the determinant of M.

$$\det M = \det \begin{pmatrix} m^1_1 & m^1_2 \\ m^2_1 & m^2_2 \end{pmatrix} = m^1_1 m^2_2 - m^1_2 m^2_1.$$


Example 91 For a $3 \times 3$ matrix,

$$M = \begin{pmatrix} m^1_1 & m^1_2 & m^1_3 \\ m^2_1 & m^2_2 & m^2_3 \\ m^3_1 & m^3_2 & m^3_3 \end{pmatrix},$$

then (see review question 1) M is non-singular if and only if:

$$\det M = m^1_1 m^2_2 m^3_3 - m^1_1 m^2_3 m^3_2 + m^1_2 m^2_3 m^3_1 - m^1_2 m^2_1 m^3_3 + m^1_3 m^2_1 m^3_2 - m^1_3 m^2_2 m^3_1 \neq 0.$$

Notice that in the subscripts, each ordering of the numbers 1, 2, and 3 occurs exactly once. Each of these is a permutation of the set {1, 2, 3}.


8.1.2 Permutations 


Consider n objects labeled 1 through n and shuffle them. Each possible shuffle is called a permutation. For example, here is a permutation of the numbers 1 through 5:

$$\sigma = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 2 & 5 & 1 & 3 \end{bmatrix}$$

We can consider a permutation σ as an invertible function from the set of numbers $[n] := \{1, 2, \ldots, n\}$ to $[n]$, so we can write $\sigma(3) = 5$ in the above example. In general we can write

$$\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ \sigma(1) & \sigma(2) & \sigma(3) & \sigma(4) & \sigma(5) \end{bmatrix},$$

but since the top line of any permutation is always the same, we can omit it and just write:

$$\sigma = \begin{bmatrix} \sigma(1) & \sigma(2) & \sigma(3) & \sigma(4) & \sigma(5) \end{bmatrix}$$

and so our example becomes simply $\sigma = [4\;2\;5\;1\;3]$.
The mathematics of permutations is extensive; there are a few key properties of permutations that we'll need:


• There are n! permutations of n distinct objects, since there are n choices for the first object, n − 1 choices for the second once the first has been chosen, and so on.


• Every permutation can be built up by successively swapping pairs of objects. For example, to build up the permutation [3 1 2] from the trivial permutation [1 2 3], you can first swap 2 and 3, and then swap 1 and 3.


• For any given permutation σ, there is some number of swaps it takes to build up the permutation. (It's simplest to use the minimum number of swaps, but you don't have to: it turns out that any way of building up the permutation from swaps will have the same parity of swaps, either even or odd.) If this number happens to be even, then σ is called an even permutation; if this number is odd, then σ is an odd permutation. In fact, n! is even for all $n \geq 2$, and exactly half of the permutations are even and the other half are odd. It's worth noting that the trivial permutation (which sends $i \to i$ for every i) is an even permutation, since it uses zero swaps.


Definition The sign function is a function sgn(σ) that sends permutations to the set {−1, 1}, defined by:

$$\mathrm{sgn}(\sigma) = \begin{cases} 1 & \text{if } \sigma \text{ is even;} \\ -1 & \text{if } \sigma \text{ is odd.} \end{cases}$$
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An Permutation Example 


aN Reading homework: problem 1 


We can use permutations to give a definition of the determinant. 


Definition For an $n \times n$ matrix M, the determinant of M (sometimes written |M|) is given by:

$$\det M = \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} m^2_{\sigma(2)} \cdots m^n_{\sigma(n)}.$$

The sum is over all permutations of n. Each summand is a product of a single entry from each row, but with the column numbers shuffled by the permutation σ.
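To make the definition concrete, here is a small Python sketch that computes the sign of a permutation by counting swaps and then evaluates the determinant as a sum over all permutations. The names `sgn` and `det` are chosen for this illustration only, and indices are 0-based (so the permutation [4 2 5 1 3] is written as the tuple (3, 1, 4, 0, 2)). The sum has n! terms, so this is practical only for small matrices.

```python
from itertools import permutations

def sgn(sigma):
    """Sign of a permutation given as a tuple of 0-based images.
    Sorts the permutation with swaps and returns +1 for an even count, -1 for odd."""
    sigma = list(sigma)
    swaps = 0
    for i in range(len(sigma)):
        while sigma[i] != i:
            j = sigma[i]
            # Swap so that the value j lands in position j.
            sigma[i], sigma[j] = sigma[j], sigma[i]
            swaps += 1
    return 1 if swaps % 2 == 0 else -1

def det(M):
    """Determinant of a square matrix (list of rows) via the permutation formula."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        term = sgn(sigma)
        for i in range(n):
            term *= M[i][sigma[i]]  # one entry from each row, columns shuffled by sigma
        total += term
    return total

print(sgn((3, 1, 4, 0, 2)))    # prints 1
print(det([[1, 2], [3, 4]]))   # prints -2
```

For a diagonal matrix, every term except the one for the trivial permutation contains an off-diagonal zero, so `det` returns the product of the diagonal entries.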


The last statement about the summands yields a nice property of the 
determinant: 


Theorem 8.1.1. If $M = (m^i_j)$ has a row consisting entirely of zeros, then $m^i_{\sigma(i)} = 0$ for every σ and some i. Moreover $\det M = 0$.


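Example 92 Because there are many permutations of n, writing the determinant this way for a general matrix gives a very long sum. For n = 4, there are 24 = 4! permutations, and for n = 5, there are already 120 = 5! permutations.

For a $4 \times 4$ matrix, $M = \begin{pmatrix} m^1_1 & m^1_2 & m^1_3 & m^1_4 \\ m^2_1 & m^2_2 & m^2_3 & m^2_4 \\ m^3_1 & m^3_2 & m^3_3 & m^3_4 \\ m^4_1 & m^4_2 & m^4_3 & m^4_4 \end{pmatrix}$, then det M is:

$$\begin{aligned} \det M ={}& m^1_1 m^2_2 m^3_3 m^4_4 - m^1_1 m^2_3 m^3_2 m^4_4 - m^1_1 m^2_2 m^3_4 m^4_3 \\ &- m^1_2 m^2_1 m^3_3 m^4_4 + m^1_1 m^2_3 m^3_4 m^4_2 + m^1_1 m^2_4 m^3_2 m^4_3 \\ &+ m^1_2 m^2_3 m^3_1 m^4_4 + m^1_2 m^2_1 m^3_4 m^4_3 \pm 16 \text{ more terms.} \end{aligned}$$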
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This is very cumbersome. 

Luckily, it is very easy to compute the determinants of certain matrices. For example, if M is diagonal, then $m^i_j = 0$ whenever $i \neq j$. Then all summands of the determinant involving off-diagonal entries vanish, so:

$$\det M = \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} m^2_{\sigma(2)} \cdots m^n_{\sigma(n)} = m^1_1 m^2_2 \cdots m^n_n.$$

Thus:


The determinant of a diagonal matrix is
the product of its diagonal entries.


Since the identity matrix is diagonal with all diagonal entries equal to one, we have:
$$\det I = 1.$$


We would like to use the determinant to decide whether a matrix is in- 
vertible. Previously, we computed the inverse of a matrix by applying row 
operations. Therefore we ask what happens to the determinant when row 
operations are applied to a matrix. 


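Swapping rows Let's swap rows i and j of a matrix M and then compute its determinant. For the permutation σ, let $\hat\sigma$ be the permutation obtained by swapping positions i and j. Clearly

$$\mathrm{sgn}(\hat\sigma) = -\mathrm{sgn}(\sigma).$$

Let M′ be the matrix M with rows i and j swapped. Then (assuming $i < j$):

$$\begin{aligned} \det M' &= \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^j_{\sigma(i)} \cdots m^i_{\sigma(j)} \cdots m^n_{\sigma(n)} \\ &= \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^i_{\sigma(j)} \cdots m^j_{\sigma(i)} \cdots m^n_{\sigma(n)} \\ &= \sum_{\sigma} \left(-\mathrm{sgn}(\hat\sigma)\right) m^1_{\hat\sigma(1)} \cdots m^i_{\hat\sigma(i)} \cdots m^j_{\hat\sigma(j)} \cdots m^n_{\hat\sigma(n)} \\ &= -\sum_{\hat\sigma} \mathrm{sgn}(\hat\sigma)\, m^1_{\hat\sigma(1)} \cdots m^i_{\hat\sigma(i)} \cdots m^j_{\hat\sigma(j)} \cdots m^n_{\hat\sigma(n)} \\ &= -\det M. \end{aligned}$$

The step replacing $\sum_{\sigma}$ by $\sum_{\hat\sigma}$ often causes confusion; it holds since we sum over all permutations (see review problem 3). Thus we see that swapping rows changes the sign of the determinant. I.e.,

$$\det M' = -\det M.$$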
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Figure 8.2: Remember what row swap does to determinants! 


Reading homework: problem 8.2 


Applying this result to M = I (the identity matrix) yields
$$\det E^i_j = -1,$$

where the matrix $E^i_j$ is the identity matrix with rows i and j swapped. It is a row swap elementary matrix.

This implies another nice property of the determinant. If two rows of the matrix are identical, then swapping the rows changes the sign of the determinant, but leaves the matrix unchanged, so $\det M = -\det M$. Then we see the following:


Theorem 8.1.2. If M has two identical rows, then det M = 0. 


8.2 Elementary Matrices and Determinants 
In chapter 2 we found the elementary matrices that perform the Gaussian row operations. In other words, for any matrix M, and a matrix M′ equal to M after a row operation, multiplying by an elementary matrix E gave
$$M' = EM.$$


rN Elementary Matrices [9] 


We now examine what the elementary matrices do to determinants.


8.2.1 Row Swap 


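Our first elementary matrix swaps rows i and j of a matrix M when it multiplies M from the left. Explicitly: let $R^1$ through $R^n$ denote the rows of M, and let M′ be the matrix M with rows i and j swapped. Then M and M′ can be regarded as block matrices (where the blocks are rows):

$$M = \begin{pmatrix} \vdots \\ R^i \\ \vdots \\ R^j \\ \vdots \end{pmatrix} \quad\text{and}\quad M' = \begin{pmatrix} \vdots \\ R^j \\ \vdots \\ R^i \\ \vdots \end{pmatrix}.$$

Then notice that:

$$M' = \begin{pmatrix} \vdots \\ R^j \\ \vdots \\ R^i \\ \vdots \end{pmatrix} = \begin{pmatrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 0 & & 1 & & \\ & & & \ddots & & & \\ & & 1 & & 0 & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{pmatrix} \begin{pmatrix} \vdots \\ R^i \\ \vdots \\ R^j \\ \vdots \end{pmatrix}.$$

The matrix

$$\begin{pmatrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 0 & & 1 & & \\ & & & \ddots & & & \\ & & 1 & & 0 & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{pmatrix} =: E^i_j$$

is just the identity matrix with rows i and j swapped. The matrix $E^i_j$ is an elementary matrix and

$$M' = E^i_j M.$$

Because $\det I = 1$ and swapping a pair of rows changes the sign of the determinant, we have found that

$$\det E^i_j = -1.$$

Now we know that swapping a pair of rows flips the sign of the determinant so $\det M' = -\det M$. But $\det E^i_j = -1$ and $M' = E^i_j M$ so

$$\det E^i_j M = \det E^i_j \det M.$$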


This result hints at a general rule for determinants of products of matrices. 


8.2.2 Scalar Multiply 
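The next row operation is multiplying a row by a scalar. Consider

$$M = \begin{pmatrix} R^1 \\ \vdots \\ R^n \end{pmatrix},$$

where $R^i$ are row vectors. Let $R^i(\lambda)$ be the identity matrix, with the ith diagonal entry replaced by λ, not to be confused with the row vectors. I.e.,

$$R^i(\lambda) = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}.$$

Then:

$$M' = R^i(\lambda) M = \begin{pmatrix} R^1 \\ \vdots \\ \lambda R^i \\ \vdots \\ R^n \end{pmatrix}$$

equals M with one row multiplied by λ.
What effect does multiplication by the elementary matrix $R^i(\lambda)$ have on the determinant?

$$\begin{aligned} \det M' &= \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots \lambda m^i_{\sigma(i)} \cdots m^n_{\sigma(n)} \\ &= \lambda \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^i_{\sigma(i)} \cdots m^n_{\sigma(n)} \\ &= \lambda \det M \end{aligned}$$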
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Figure 8.3: Rescaling a row rescales the determinant. 


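Thus, multiplying a row by λ multiplies the determinant by λ. I.e.,
$$\det R^i(\lambda) M = \lambda \det M.$$

Since $R^i(\lambda)$ is just the identity matrix with a single row multiplied by λ, then by the above rule, the determinant of $R^i(\lambda)$ is λ. Thus:

$$\det R^i(\lambda) = \det \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} = \lambda,$$

and once again we have a product of determinants formula:

$$\det R^i(\lambda) M = \det R^i(\lambda) \det M.$$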


8.2.3 Row Addition 


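The final row operation is adding $\mu R^j$ to $R^i$. This is done with the elementary matrix $S^i_j(\mu)$, which is an identity matrix but with an additional μ in the i, j position:

$$S^i_j(\mu) = \begin{pmatrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 1 & & \mu & & \\ & & & \ddots & & & \\ & & & & 1 & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{pmatrix}.$$

Then multiplying M by $S^i_j(\mu)$ performs a row addition:

$$S^i_j(\mu) \begin{pmatrix} \vdots \\ R^i \\ \vdots \\ R^j \\ \vdots \end{pmatrix} = \begin{pmatrix} \vdots \\ R^i + \mu R^j \\ \vdots \\ R^j \\ \vdots \end{pmatrix}.$$

What is the effect of multiplying by $S^i_j(\mu)$ on the determinant? Let $M' = S^i_j(\mu) M$, and let M″ be the matrix M but with $R^i$ replaced by $R^j$. Then

$$\begin{aligned} \det M' &= \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots \left(m^i_{\sigma(i)} + \mu m^j_{\sigma(i)}\right) \cdots m^n_{\sigma(n)} \\ &= \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^i_{\sigma(i)} \cdots m^n_{\sigma(n)} + \mu \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^j_{\sigma(i)} \cdots m^j_{\sigma(j)} \cdots m^n_{\sigma(n)} \\ &= \det M + \mu \det M''. \end{aligned}$$

Since M″ has two identical rows, its determinant is 0 so
$$\det M' = \det M,$$
when M′ is obtained from M by adding μ times row j to row i.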


<2 Reading homework: problem 3 
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Figure 8.4: Adding one row to another leaves the determinant unchanged. 


We also have learnt that
$$\det S^i_j(\mu) M = \det M.$$
Notice that if M is the identity matrix, then we have

$$\det S^i_j(\mu) = \det\left(S^i_j(\mu) I\right) = \det I = 1.$$


8.2.4 Determinant of Products 


In summary, the elementary matrices for each of the row operations obey


$E^i_j$ = I with rows i, j swapped; $\det E^i_j = -1$
$R^i(\lambda)$ = I with λ in position i, i; $\det R^i(\lambda) = \lambda$
$S^i_j(\mu)$ = I with μ in position i, j; $\det S^i_j(\mu) = 1$

Elementary Determinants


Moreover we found a useful formula for determinants of products: 


Theorem 8.2.1. If E is any of the elementary matrices $E^i_j$, $R^i(\lambda)$, $S^i_j(\mu)$, then $\det(EM) = \det E \det M$.
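The theorem can be spot-checked numerically. The sketch below builds one elementary matrix of each kind and compares det(EM) with det E det M for one sample matrix. The helper names (`row_swap`, `row_scale`, `row_add`, `matmul`) are invented for this illustration, indices are 0-based, and the determinant is computed from the permutation formula, with the sign obtained by counting inversions (an equivalent way of finding the parity of a permutation).

```python
from itertools import permutations

def det(M):
    """Determinant via the sum over permutations; the sign comes from the inversion count."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if sigma[a] > sigma[b])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= M[i][sigma[i]]
        total += term
    return total

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def row_swap(n, i, j):
    """Identity with rows i and j swapped."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def row_scale(n, i, lam):
    """Identity with lam in position (i, i)."""
    R = identity(n)
    R[i][i] = lam
    return R

def row_add(n, i, j, mu):
    """Identity with an additional mu in position (i, j), for i != j."""
    S = identity(n)
    S[i][j] = mu
    return S

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

M = [[6, 18, 3], [2, 12, 1], [4, 15, 3]]
for E in (row_swap(3, 0, 2), row_scale(3, 1, 5), row_add(3, 2, 0, 7)):
    # det(EM) should agree with det(E) * det(M) for each kind of elementary matrix.
    print(det(E), det(matmul(E, M)), det(E) * det(M))
```

A check on one matrix is of course not a proof; it only illustrates the three cases established above.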
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We have seen that any matrix M can be put into reduced row echelon form 
via a sequence of row operations, and we have seen that any row operation can 
be achieved via left matrix multiplication by an elementary matrix. Suppose 
that RREF(M) is the reduced row echelon form of M. Then 


$$\mathrm{RREF}(M) = E_1 E_2 \cdots E_k M,$$


where each $E_i$ is an elementary matrix. We know how to compute determinants of elementary matrices and products thereof, so we ask:


What is the determinant of a square matrix in reduced row echelon form? 


The answer has two cases: 


1. If M is not invertible, then some row of RREF(M) contains only zeros. Then we can multiply the zero row by any constant λ without changing RREF(M); by our previous observation, this scales the determinant of RREF(M) by λ. Thus, if M is not invertible, $\det \mathrm{RREF}(M) = \lambda \det \mathrm{RREF}(M)$ for every λ, and so $\det \mathrm{RREF}(M) = 0$.


2. Otherwise, every row of RREF(M) has a pivot on the diagonal; since 
M is square, this means that RREF(M) is the identity matrix. So if 
M is invertible, det RREF(M) = 1. 


Notice that because $\det \mathrm{RREF}(M) = \det(E_1 E_2 \cdots E_k M)$, by the theorem above,
$$\det \mathrm{RREF}(M) = \det(E_1) \cdots \det(E_k) \det M.$$


Since each $E_i$ has non-zero determinant, then $\det \mathrm{RREF}(M) = 0$ if and only if $\det M = 0$. This establishes an important theorem:


Theorem 8.2.2. For any square matrix M, $\det M \neq 0$ if and only if M is invertible.


Since we know the determinants of the elementary matrices, we can im- 
mediately obtain the following: 


Determinants and Inverses


[Figure: Theorem. M invertible ⟺ det M ≠ 0.]


Figure 8.5: Determinants measure if a matrix is invertible. 


Corollary 8.2.3. Any elementary matrix $E^i_j$, $R^i(\lambda)$, $S^i_j(\mu)$ is invertible, except for $R^i(0)$. In fact, the inverse of an elementary matrix is another elementary matrix.


To obtain one last important result, suppose that M and N are square $n \times n$ matrices, with reduced row echelon forms such that, for elementary matrices $E_i$ and $F_i$,

$$M = E_1 E_2 \cdots E_k \,\mathrm{RREF}(M),$$

and
$$N = F_1 F_2 \cdots F_l \,\mathrm{RREF}(N).$$

If RREF(M) is the identity matrix (i.e., M is invertible), then:

$$\begin{aligned} \det(MN) &= \det\left(E_1 E_2 \cdots E_k \,\mathrm{RREF}(M)\, F_1 F_2 \cdots F_l \,\mathrm{RREF}(N)\right) \\ &= \det\left(E_1 E_2 \cdots E_k \, I \, F_1 F_2 \cdots F_l \,\mathrm{RREF}(N)\right) \\ &= \det(E_1) \cdots \det(E_k) \det(I) \det(F_1) \cdots \det(F_l) \det \mathrm{RREF}(N) \\ &= \det(M) \det(N). \end{aligned}$$

Otherwise, M is not invertible, and $\det M = 0 = \det \mathrm{RREF}(M)$. Then there exists a row of zeros in RREF(M), so $R^n(\lambda)\,\mathrm{RREF}(M) = \mathrm{RREF}(M)$ for any λ. Then:

$$\begin{aligned} \det(MN) &= \det\left(E_1 E_2 \cdots E_k \,\mathrm{RREF}(M) N\right) \\ &= \det(E_1) \cdots \det(E_k) \det\left(\mathrm{RREF}(M) N\right) \\ &= \det(E_1) \cdots \det(E_k) \det\left(R^n(\lambda)\,\mathrm{RREF}(M) N\right) \\ &= \det(E_1) \cdots \det(E_k)\, \lambda \det\left(\mathrm{RREF}(M) N\right) \\ &= \lambda \det(MN). \end{aligned}$$


Figure 8.6: “The determinant of a product is the product of determinants.” 


Since this holds for every λ, it implies that $\det(MN) = 0 = \det M \det N$.
Thus we have shown that for any square matrices M and N of the same size,


$$\det(MN) = \det M \det N.$$


This result is extremely important; do not forget it! 
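As a quick sanity check of the product rule, here is a small worked example; the two matrices are chosen arbitrarily for illustration and do not come from the text.

```latex
\begin{align*}
M &= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, &
\det M &= 1 \cdot 4 - 2 \cdot 3 = -2, \\
N &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, &
\det N &= 0 \cdot 0 - 1 \cdot 1 = -1, \\
MN &= \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}, &
\det(MN) &= 2 \cdot 3 - 1 \cdot 4 = 2 = (-2)(-1) = \det M \det N.
\end{align*}
```

Here N is a row swap matrix, so multiplying by it on the right swaps the columns of M and flips the sign of the determinant.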


@ Alternative proof 


ON Reading homework: problem 4 


8.3 Review Problems 


Reading Problems: 1, 2, 3, 4
Webwork: 2 × 2 Determinant 7
Determinants and invertibility 8, 9, 10, 11
1. Let
$$M = \begin{pmatrix} m^1_1 & m^1_2 & m^1_3 \\ m^2_1 & m^2_2 & m^2_3 \\ m^3_1 & m^3_2 & m^3_3 \end{pmatrix}.$$
Use row operations to put M into row echelon form. For simplicity, assume that $m^1_1 \neq 0 \neq m^1_1 m^2_2 - m^2_1 m^1_2$.


Prove that M is non-singular if and only if:
$$m^1_1 m^2_2 m^3_3 - m^1_1 m^2_3 m^3_2 + m^1_2 m^2_3 m^3_1 - m^1_2 m^2_1 m^3_3 + m^1_3 m^2_1 m^3_2 - m^1_3 m^2_2 m^3_1 \neq 0.$$

2. (a) What does the matrix $E^1_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ do to $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ under left multiplication? What about right multiplication?


(b) Find elementary matrices $R^1(\lambda)$ and $R^2(\lambda)$ that respectively multiply rows 1 and 2 of M by λ but otherwise leave M the same under left multiplication.


(c) Find a matrix $S^1_2(\lambda)$ that adds a multiple λ of row 2 to row 1 under left multiplication.


3. Let $\hat\sigma$ denote the permutation obtained from σ by transposing the first two outputs, i.e. $\hat\sigma(1) = \sigma(2)$ and $\hat\sigma(2) = \sigma(1)$. Suppose the function $f : \{1, 2, 3, 4\} \to \mathbb{R}$. Write out explicitly the following two sums:

$$\sum_{\sigma} f(\sigma(s)) \quad\text{and}\quad \sum_{\sigma} f(\hat\sigma(s)).$$
What do you observe? Now write a brief explanation why the following equality holds
$$\sum_{\sigma} F(\sigma) = \sum_{\sigma} F(\hat\sigma),$$

where the domain of the function F is the set of all permutations of n objects and $\hat\sigma$ is related to σ by swapping a given pair of objects.


4. Let M be a matrix and $S^i_j M$ the same matrix with rows i and j switched. Explain every line of the series of equations proving that

$$\det M = -\det(S^i_j M).$$


5. Let M’ be the matrix obtained from M by swapping two columns i 
and j. Show that det M’ = — det M. 


6. The scalar triple product of three vectors u, v, w from $\mathbb{R}^3$ is $u \cdot (v \times w)$. Show that this product is the same as the determinant of the matrix whose columns are u, v, w (in that order). What happens to the scalar triple product when the factors are permuted?


7. Show that if M is a $3 \times 3$ matrix whose third row is a sum of multiples of the other rows ($R^3 = aR^2 + bR^1$) then $\det M = 0$. Show that the same is true if one of the columns is a sum of multiples of the others.


8. Calculate the determinant below by factoring the matrix into elementary matrices times simpler matrices and using the trick
$$\det(M) = \det(E^{-1} E M) = \det(E^{-1}) \det(EM).$$

Explicitly show each ERO matrix.

det 


N AN 
N o e 
N =e © 


9. Let $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $N = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$. Compute the following:

(a) $\det M$.

(b) $\det N$.

(c) $\det(MN)$.

(d) $\det M \det N$.

(e) $\det(M^{-1})$ assuming $ad - bc \neq 0$.

(f) $\det(M^T)$.

(g) $\det(M + N) - (\det M + \det N)$. Is the determinant a linear transformation from square matrices to real numbers? Explain.


10. Suppose $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible. Write M as a product of elementary row matrices times RREF(M).

11. Find the inverses of each of the elementary matrices, $E^i_j$, $R^i(\lambda)$, $S^i_j(\lambda)$. Make sure to show that the elementary matrix times its inverse is actually the identity.

12. Let $e^i_j$ denote the matrix with a 1 in the i-th row and j-th column and 0's everywhere else, and let A be an arbitrary $2 \times 2$ matrix. Compute $\det(A + tI_2)$. What is the first order term (the $t^1$ term)? Can you express your results in terms of tr(A)? What about the first order term in $\det(A + tI_n)$ for any arbitrary $n \times n$ matrix A in terms of tr(A)?

Note that the result of $\det(A + tI_2)$ is a polynomial in the variable t known as the characteristic polynomial.


13. (Directional) Derivative of the Determinant:

Notice that $\det : M^n_n \to \mathbb{R}$ where $M^n_n$ is the vector space of all $n \times n$ matrices, and so we can take directional derivatives of det. Let A be an arbitrary $n \times n$ matrix, and for all i and j compute the following:

(a) $$\lim_{t \to 0} \frac{\det(I_2 + t e^i_j) - \det(I_2)}{t}$$

(b) $$\lim_{t \to 0} \frac{\det(I_3 + t e^i_j) - \det(I_3)}{t}$$

(c) $$\lim_{t \to 0} \frac{\det(I_n + t e^i_j) - \det(I_n)}{t}$$

(d) $$\lim_{t \to 0} \frac{\det(I_n + At) - \det(I_n)}{t}$$

Note, these are the directional derivatives in the $e^i_j$ and A directions.

14. How many functions are in the set
$$\{ f : \{1, \ldots, n\} \to \{1, \ldots, n\} \mid f^{-1} \text{ exists} \} \,?$$

What about the set
$$\{1, \ldots, n\}^{\{1, \ldots, n\}} \,?$$

Which of these two sets correspond to the set of all permutations of n objects?
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8.4 Properties of the Determinant 


We now know that the determinant of a matrix is non-zero if and only if that 
matrix is invertible. We also know that the determinant is a multiplicative 
function, in the sense that det( MN) = det M det N. Now we will devise 
some methods for calculating the determinant. 

Recall that:

$$\det M = \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} m^2_{\sigma(2)} \cdots m^n_{\sigma(n)}.$$

A minor of an $n \times n$ matrix M is the determinant of any square matrix obtained from M by deleting one row and one column. In particular, any entry $m^i_j$ of a square matrix M is associated to a minor obtained by deleting the ith row and jth column of M.

It is possible to write the determinant of a matrix in terms of its minors by grouping the terms of the sum according to which entry $m^1_j$ of the first row they contain:

$$\det M = \sum_{j=1}^{n} m^1_j \sum_{\not\sigma^{j}} \mathrm{sgn}(\not\sigma^{j})\, m^2_{\not\sigma^{j}(2)} \cdots m^n_{\not\sigma^{j}(n)}.$$

Here the symbol $\not\sigma^{j}$ refers to a permutation σ with $\sigma(1) = j$, regarded as a rule assigning the remaining column numbers to rows 2 through n. The summand for a given j looks like the determinant of the minor obtained by removing the first row and the jth column of M. However we still need to replace the sum over $\not\sigma^{j}$ by a sum over permutations of the column numbers of the matrix entries of this minor. This costs a minus sign whenever j − 1 is odd. In other words, to expand by minors we pick an entry $m^1_j$ of the first row, multiply it by $(-1)^{j-1}$ times the determinant of the matrix with row 1 and column j deleted, and add up the results over all j. An example will probably help:
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Example 93 Let’s compute the determinant of 


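$$M = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$

using expansion by minors:

$$\begin{aligned} \det M &= 1 \det \begin{pmatrix} 5 & 6 \\ 8 & 9 \end{pmatrix} - 2 \det \begin{pmatrix} 4 & 6 \\ 7 & 9 \end{pmatrix} + 3 \det \begin{pmatrix} 4 & 5 \\ 7 & 8 \end{pmatrix} \\ &= 1(5 \cdot 9 - 8 \cdot 6) - 2(4 \cdot 9 - 7 \cdot 6) + 3(4 \cdot 8 - 7 \cdot 5) \\ &= 0 \end{aligned}$$

Here, $M^{-1}$ does not exist because¹ $\det M = 0$.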
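Expansion by minors along the first row translates directly into a recursive function. The following Python sketch is illustrative only: the names `minor_matrix` and `det_by_minors` are made up here, columns are numbered from 0 (so the sign $(-1)^{j-1}$ of the text becomes `(-1) ** j`), and the determinant of the empty matrix is taken to be 1 so that the recursion ends cleanly.

```python
def minor_matrix(M, i, j):
    """Return M with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det_by_minors(M):
    """Determinant by expansion along the first row."""
    if not M:
        return 1  # convention for the 0 x 0 matrix; makes the 1 x 1 case work
    return sum((-1) ** j * M[0][j] * det_by_minors(minor_matrix(M, 0, j))
               for j in range(len(M)))

M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(det_by_minors(M))  # prints 0
```

Like the permutation formula, this recursion does on the order of n! work, which is one reason row operations are preferred for larger matrices.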


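Example 94 Sometimes the entries of a matrix allow us to simplify the calculation of the determinant. Take $N = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 0 & 0 \\ 7 & 8 & 9 \end{pmatrix}$. Notice that the second row has many zeros; then we can switch the first and second rows of N before expanding in minors to get:

$$\det \begin{pmatrix} 1 & 2 & 3 \\ 4 & 0 & 0 \\ 7 & 8 & 9 \end{pmatrix} = -\det \begin{pmatrix} 4 & 0 & 0 \\ 1 & 2 & 3 \\ 7 & 8 & 9 \end{pmatrix} = -4 \det \begin{pmatrix} 2 & 3 \\ 8 & 9 \end{pmatrix} = 24$$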

Since we know how the determinant of a matrix changes when you perform 
row operations, it is often very beneficial to perform row operations before 
computing the determinant by brute force. 


1A fun exercise is to compute the determinant of a 4 x 4 matrix filled in order, from 
left to right, with the numbers 1,2,3,...,16. What do you observe? Try the same for a 
5 x 5 matrix with 1,2,3,...,25. Is there a pattern? Can you explain it? 
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Example 95 
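$$\det \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = \det \begin{pmatrix} 1 & 2 & 3 \\ 3 & 3 & 3 \\ 6 & 6 & 6 \end{pmatrix} = \det \begin{pmatrix} 1 & 2 & 3 \\ 3 & 3 & 3 \\ 0 & 0 & 0 \end{pmatrix} = 0.$$
Try to determine which row operations we made at each step of this computation.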


You might suspect that determinants have similar properties with respect 
to columns as what applies to rows: 


Theorem 8.4.1. For any square matrix M, we have:
$$\det M^T = \det M.$$


Proof. By definition,

$$\det M = \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} m^2_{\sigma(2)} \cdots m^n_{\sigma(n)}.$$

For any permutation σ, there is a unique inverse permutation $\sigma^{-1}$ that undoes σ. If σ sends $i \to j$, then $\sigma^{-1}$ sends $j \to i$. In the two-line notation for a permutation, this corresponds to just flipping the permutation over. For example, if $\sigma = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{bmatrix}$, then we can find $\sigma^{-1}$ by flipping the permutation and then putting the columns in order:

$$\sigma^{-1} = \begin{bmatrix} 2 & 3 & 1 \\ 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{bmatrix}.$$

Since any permutation can be built up by transpositions, one can also find the inverse of a permutation σ by undoing each of the transpositions used to build up σ; this shows that one can use the same number of transpositions to build σ and $\sigma^{-1}$. In particular, $\mathrm{sgn}\, \sigma = \mathrm{sgn}\, \sigma^{-1}$.


<2 Reading homework: problem 5 
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Figure 8.7: Transposes leave the determinant unchanged. 


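Then we can write out the above in formulas as follows:

$$\begin{aligned} \det M &= \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^1_{\sigma(1)} m^2_{\sigma(2)} \cdots m^n_{\sigma(n)} \\ &= \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^{\sigma^{-1}(1)}_1 m^{\sigma^{-1}(2)}_2 \cdots m^{\sigma^{-1}(n)}_n \\ &= \sum_{\sigma} \mathrm{sgn}(\sigma^{-1})\, m^{\sigma^{-1}(1)}_1 m^{\sigma^{-1}(2)}_2 \cdots m^{\sigma^{-1}(n)}_n \\ &= \sum_{\sigma} \mathrm{sgn}(\sigma)\, m^{\sigma(1)}_1 m^{\sigma(2)}_2 \cdots m^{\sigma(n)}_n \\ &= \det M^T. \end{aligned}$$

The second-to-last equality is due to the existence of a unique inverse permutation: summing over permutations is the same as summing over all inverses of permutations (see review problem 4). The final equality is by the definition of the transpose.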


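Example 96 Because of this theorem, we see that expansion by minors also works over columns. Let

$$M = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 5 & 6 \\ 0 & 8 & 9 \end{pmatrix}.$$

Then

$$\det M = \det M^T = 1 \det \begin{pmatrix} 5 & 8 \\ 6 & 9 \end{pmatrix} = -3.$$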
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8.4.1 Determinant of the Inverse 


Let M and N be $n \times n$ matrices. We previously showed that

$$\det(MN) = \det M \det N, \quad\text{and}\quad \det I = 1.$$
Then $1 = \det I = \det(M M^{-1}) = \det M \det M^{-1}$. As such we have:


Theorem 8.4.2.
$$\det M^{-1} = \frac{1}{\det M}$$


Just so you don’t forget this: 


8.4.2 Adjoint of a Matrix 


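Recall that for a $2 \times 2$ matrix
\[ \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} I. \]

Or in a more careful notation: if
\[ M = \begin{pmatrix} m^1_1 & m^1_2 \\ m^2_1 & m^2_2 \end{pmatrix}, \]
then
\[ M^{-1} = \frac{1}{m^1_1 m^2_2 - m^1_2 m^2_1} \begin{pmatrix} m^2_2 & -m^1_2 \\ -m^2_1 & m^1_1 \end{pmatrix}, \]
so long as $\det M = m^1_1 m^2_2 - m^1_2 m^2_1 \neq 0$. The matrix
$\begin{pmatrix} m^2_2 & -m^1_2 \\ -m^2_1 & m^1_1 \end{pmatrix}$ that
appears above is a special matrix, called the adjoint of $M$. Let’s define the
adjoint for an $n \times n$ matrix.

The cofactor of $M$ corresponding to the entry $m^i_j$ of $M$ is the product of
the minor associated to $m^i_j$ times $(-1)^{i+j}$. This is written cofactor$(m^i_j)$.

Definition For $M = (m^i_j)$ a square matrix, the adjoint matrix adj $M$ is
given by:
\[ \operatorname{adj} M = (\operatorname{cofactor}(m^i_j))^T. \]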


Example 97 


Reading homework: problem 6


Let’s multiply $M \operatorname{adj} M$. For any matrix $N$, the $i, j$ entry of $MN$ is given
by taking the dot product of the $i$th row of $M$ and the $j$th column of $N$.
Notice that the dot product of the $i$th row of $M$ and the $i$th column of adj $M$
is just the expansion by minors of $\det M$ in the $i$th row. Further, notice that
the dot product of the $i$th row of $M$ and the $j$th column of adj $M$ with $j \neq i$
is the same as expanding $M$ by minors, but with the $j$th row replaced by the
$i$th row. Since the determinant of any matrix with a row repeated is zero,
these dot products are zero as well.

So every diagonal entry of $M \operatorname{adj} M$ equals $\det M$, and every off-diagonal
entry is zero. Then:
\[ M \operatorname{adj} M = (\det M)\, I. \]
Thus, when $\det M \neq 0$, the adjoint gives an explicit formula for $M^{-1}$.


Theorem 8.4.3. For $M$ a square matrix with $\det M \neq 0$ (equivalently, if $M$
is invertible), then
\[ M^{-1} = \frac{1}{\det M} \operatorname{adj} M. \]


The Adjoint Matrix

\[ (\operatorname{adj} M)\, M = (\det M)\, I \]


Example 98 Continuing with the previous example, 


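\[ \operatorname{adj} \begin{pmatrix} 3 & -1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 0 & 2 \\ -1 & 3 & -1 \\ 1 & -3 & 7 \end{pmatrix}. \]
Now, multiply:
\[ \begin{pmatrix} 3 & -1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 & 2 \\ -1 & 3 & -1 \\ 1 & -3 & 7 \end{pmatrix} = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 6 \end{pmatrix} \]
\[ \Rightarrow \begin{pmatrix} 3 & -1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \end{pmatrix}^{-1} = \frac{1}{6} \begin{pmatrix} 2 & 0 & 2 \\ -1 & 3 & -1 \\ 1 & -3 & 7 \end{pmatrix}. \]


This process for finding the inverse matrix is sometimes called Cramer’s Rule.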
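
The adjoint construction translates directly into a short program. The sketch below is illustrative only: the function names (`minor`, `det`, `adjugate`, `matmul`, `inverse`) are our own, indices are 0-based, and exact arithmetic with `Fraction` is an implementation choice rather than anything the text prescribes. It uses the matrix $M$ with rows $(3, -1, -1)$, $(1, 2, 0)$, $(0, 1, 1)$.

```python
from fractions import Fraction


def minor(M, i, j):
    """Matrix M with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]


def det(M):
    """Determinant by expansion by minors along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))


def adjugate(M):
    """Transpose of the matrix of cofactors."""
    n = len(M)
    cofactors = [
        [(-1) ** (i + j) * det(minor(M, i, j)) for j in range(n)]
        for i in range(n)
    ]
    return [list(col) for col in zip(*cofactors)]


def matmul(A, B):
    """Entry (i, j) is the dot product of row i of A with column j of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(M):
    """M^{-1} = adj(M) / det(M), valid only when det(M) is non-zero."""
    d = det(M)
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(entry, d) for entry in row] for row in adjugate(M)]


M = [[3, -1, -1],
     [1, 2, 0],
     [0, 1, 1]]

print(adjugate(M))
print(matmul(M, adjugate(M)))  # a multiple of the identity matrix
print(det(inverse(M)))         # compare with 1 / det(M)
```

Expansion by minors is used here because it mirrors the definitions above; for large matrices, Gaussian elimination is far more efficient.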


8.4.3 Application: Volume of a Parallelepiped 


Given three vectors u,v, w in R3, the parallelepiped determined by the three 
vectors is the “squished” box whose edges are parallel to u,v, and w as 
depicted in Figure 8.8. 

From calculus, we know that the volume of this object is $|u \cdot (v \times w)|$.
This is the same as expansion by minors of the matrix whose columns are
$u, v, w$. Then:
\[ \text{Volume} = \bigl| \det \begin{pmatrix} u & v & w \end{pmatrix} \bigr| \]

Figure 8.8: A parallelepiped.


8.5 Review Problems 


Webwork:
Reading Problems
Row of zeros
3 x 3 determinant          13
Triangular determinants    14, 15, 16, 17
Expanding in a column      18
Minors and cofactors       19

1. Find the determinant via expanding by minors. 


PNM om bY 


OrRrReR 


N oo B® WwW 
oo rN 


2. Even if $M$ is not a square matrix, both $MM^T$ and $M^TM$ are square. Is
it true that $\det(MM^T) = \det(M^TM)$ for all matrices $M$? How about
$\operatorname{tr}(MM^T) = \operatorname{tr}(M^TM)$?




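3. Let $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Show:
\[ \det M = \frac{1}{2} (\operatorname{tr} M)^2 - \frac{1}{2} \operatorname{tr}(M^2). \]
Suppose $M$ is a $3 \times 3$ matrix. Find and verify a similar formula for
$\det M$ in terms of $\operatorname{tr}(M^3)$, $\operatorname{tr}(M^2)$, and $\operatorname{tr} M$. Hint: make an ansatz for
your formula and derive a system of linear equations for any unknowns
you introduce by testing it on explicit matrices.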


4. Let $\sigma^{-1}$ denote the inverse permutation of $\sigma$. Suppose the function
$f : \{1, 2, 3, 4\} \to \mathbb{R}$. Write out explicitly the following two sums:
\[ \sum_{\sigma} f(\sigma(s)) \quad \text{and} \quad \sum_{\sigma} f(\sigma^{-1}(s)). \]
What do you observe? Now write a brief explanation why the following
equality holds
\[ \sum_{\sigma} F(\sigma) = \sum_{\sigma} F(\sigma^{-1}), \]
where the domain of the function $F$ is the set of all permutations of $n$
objects.


5. Suppose $M = LU$ is an LU decomposition. Explain how you would
efficiently compute $\det M$ in this case. How does this decomposition
allow you to easily see if $M$ is invertible?


6. In computer science, the complexity of an algorithm is (roughly) computed
by counting the number of times a given operation is performed.
Suppose adding or subtracting any two numbers takes $a$ seconds, and
multiplying two numbers takes $m$ seconds. Then, for example, computing
$2 \cdot 6 - 5$ would take $a + m$ seconds.


(a) How many additions and multiplications does it take to compute 
the determinant of a general 2 x 2 matrix? 


(b) Write a formula for the number of additions and multiplications it
takes to compute the determinant of a general $n \times n$ matrix using
the definition of the determinant as a sum over permutations.
Assume that finding and multiplying by the sign of a permutation
is free.

(c) How many additions and multiplications does it take to compute
the determinant of a general $3 \times 3$ matrix using expansion by
minors? Assuming $m = 2a$, is this faster than computing the
determinant from the definition?


Hint


Subspaces and Spanning Sets


It is time to study vector spaces more carefully and return to some funda- 
mental questions: 


1. Subspaces: When is a subset of a vector space itself a vector space? 
(This is the notion of a subspace.) 


2. Linear Independence: Given a collection of vectors, is there a way to 
tell whether they are independent, or if one is a “linear combination” 
of the others? 


3. Dimension: Is there a consistent definition of how “big” a vector space 
is? 


4. Basis: How do we label vectors? Can we write any vector as a sum of 
some basic set of vectors? How do we change our point of view from 
vectors labeled one way to vectors labeled in another way? 


Let’s start at the top! 


9.1 Subspaces

Definition We say that a subset $U$ of a vector space $V$ is a subspace of $V$
if $U$ is a vector space under the inherited addition and scalar multiplication
operations of $V$.

Example 99 Consider a plane $P$ in $\mathbb{R}^3$ through the origin:
\[ ax + by + cz = 0. \]
This equation can be expressed as the homogeneous system
$\begin{pmatrix} a & b & c \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0$, or
$MX = 0$ with $M$ the matrix $\begin{pmatrix} a & b & c \end{pmatrix}$. If $X_1$ and $X_2$ are both solutions to $MX = 0$,
then, by linearity of matrix multiplication, so is $\mu X_1 + \nu X_2$:
\[ M(\mu X_1 + \nu X_2) = \mu M X_1 + \nu M X_2 = 0. \]
So $P$ is closed under addition and scalar multiplication. Additionally, $P$ contains the
origin (which can be derived from the above by setting $\mu = \nu = 0$). All other vector
space requirements hold for $P$ because they hold for all vectors in $\mathbb{R}^3$.


Theorem 9.1.1 (Subspace Theorem). Let $U$ be a non-empty subset of a
vector space $V$. Then $U$ is a subspace if and only if $\mu u_1 + \nu u_2 \in U$ for
arbitrary $u_1, u_2$ in $U$, and arbitrary constants $\mu, \nu$.


Proof. One direction of this proof is easy: if $U$ is a subspace, then it is a vector
space, and so by the additive closure and multiplicative closure properties of
vector spaces, it has to be true that $\mu u_1 + \nu u_2 \in U$ for all $u_1, u_2$ in $U$ and all
constants $\mu, \nu$.

The other direction is almost as easy: we need to show that if $\mu u_1 + \nu u_2 \in U$
for all $u_1, u_2$ in $U$ and all constants $\mu, \nu$, then $U$ is a vector space. That
is, we need to show that the ten properties of vector spaces are satisfied.
We know that the additive closure and multiplicative closure properties are
satisfied (take $\mu = \nu = 1$ and $\nu = 0$, respectively). The zero vector and
additive inverses lie in $U$ because $0 u_1$ and $(-1) u_1$ do. Each of the other
properties is true in $U$ because it is true in $V$.

Note that the requirements of the subspace theorem are often referred to as
“closure”.

We can use this theorem to check if a set is a vector space. That is, if we 
have some set U of vectors that come from some bigger vector space V, to 
check if U itself forms a smaller vector space we need check only two things: 


1. If we add any two vectors in U, do we end up with a vector in U? 


2. If we multiply any vector in U by any constant, do we end up with a 
vector in U? 


If the answer to both of these questions is yes, then U is a vector space. If 
not, U is not a vector space. 
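
The two closure questions can be probed mechanically. The sketch below is only an illustration under stated assumptions: the function `passes_closure_test`, the sample vectors and the two planes are hypothetical choices of ours. A finite test like this can refute closure by finding a counterexample, but passing it does not prove that a set is a subspace.

```python
from itertools import product


def passes_closure_test(contains, samples, scalars=range(-2, 3)):
    """Check mu*u1 + nu*u2 for all sample pairs and all small integer scalars.

    `contains` is a membership test for the candidate subset U.
    Returns False as soon as one combination leaves U.
    """
    for u1, u2 in product(samples, repeat=2):
        for mu, nu in product(scalars, repeat=2):
            combo = tuple(mu * a + nu * b for a, b in zip(u1, u2))
            if not contains(combo):
                return False
    return True


def in_plane(v):
    """Plane through the origin: x + 2y - 3z = 0."""
    return v[0] + 2 * v[1] - 3 * v[2] == 0


def in_shifted_plane(v):
    """Plane missing the origin: x + 2y - 3z = 1."""
    return v[0] + 2 * v[1] - 3 * v[2] == 1


plane_samples = [(3, 0, 1), (-2, 1, 0), (1, 1, 1)]
shifted_samples = [(1, 0, 0), (-1, 1, 0), (4, 0, 1)]

print(passes_closure_test(in_plane, plane_samples))
print(passes_closure_test(in_shifted_plane, shifted_samples))
```

The shifted plane fails immediately because the combination with $\mu = \nu = 0$ is the zero vector, which does not satisfy $x + 2y - 3z = 1$.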


Reading homework: problem 1


9.2 Building Subspaces 


Consider the set
\[ U = \left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\} \subset \mathbb{R}^3. \]
Because $U$ consists of only two vectors, it is clear that $U$ is not a vector space,
since any constant multiple of these vectors should also be in $U$. For example,
the 0-vector is not in $U$, nor is $U$ closed under vector addition.

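But we know that any two vectors define a plane:

In this case, the vectors in $U$ define the $xy$-plane in $\mathbb{R}^3$. We can view the
$xy$-plane as the set of all vectors that arise as a linear combination of the two
vectors in $U$. We call this set of all linear combinations the span of $U$:
\[ \operatorname{span}(U) = \left\{ x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \,\middle|\, x, y \in \mathbb{R} \right\}. \]
Notice that any vector in the $xy$-plane is of the form
\[ \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} = x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \in \operatorname{span}(U). \]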


Definition Let $V$ be a vector space and $S = \{s_1, s_2, \ldots\} \subset V$ a subset of $V$.
Then the span of $S$ is the set:
\[ \operatorname{span}(S) := \{ r^1 s_1 + r^2 s_2 + \cdots + r^N s_N \mid r^i \in \mathbb{R}, N \in \mathbb{N} \}. \]
That is, the span of $S$ is the set of all finite linear combinations¹ of
elements of $S$. Any finite sum of the form “a constant times $s_1$ plus a constant
times $s_2$ plus a constant times $s_3$ and so on” is in the span of $S$.


It is important that we only allow finite linear combinations. In the definition 
above, N must be a finite number. It can be any finite number, but it must 
be finite. 


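Example 100 Let $V = \mathbb{R}^3$ and $X \subset V$ be the $x$-axis. Let $P = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, and set
\[ S = X \cup \{P\}. \]
The vector $\begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix}$ is in span$(S)$, because
$\begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$. Similarly, the vector
$\begin{pmatrix} -12 \\ 17.5 \\ 0 \end{pmatrix}$ is in span$(S)$, because
$\begin{pmatrix} -12 \\ 17.5 \\ 0 \end{pmatrix} = \begin{pmatrix} -12 \\ 0 \\ 0 \end{pmatrix} + 17.5 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$. Similarly, any vector
of the form
\[ \begin{pmatrix} x \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} \]
is in span$(S)$. On the other hand, any vector in span$(S)$ must have a zero in the
$z$-coordinate. (Why?)

So span$(S)$ is the $xy$-plane, which is a vector space. (Try drawing a picture to
verify this!)

¹Usually our vector spaces are defined over $\mathbb{R}$, but in general we can have vector spaces
defined over different base fields such as $\mathbb{C}$ or $\mathbb{Z}_2$. The coefficients $r^i$ should come from
whatever our base field is (usually $\mathbb{R}$).


Reading homework: problem 2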


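Lemma 9.2.1. For any subset $S \subset V$, span$(S)$ is a subspace of $V$.


Proof. We need to show that span$(S)$ is a vector space.

It suffices to show that span$(S)$ is closed under linear combinations. Let
$u, v \in$ span$(S)$ and $\lambda, \mu$ be constants. By the definition of span$(S)$, there are
constants $c^i$ and $d^i$ (some of which could be zero) such that:
\begin{align*}
u &= c^1 s_1 + c^2 s_2 + \cdots \\
v &= d^1 s_1 + d^2 s_2 + \cdots \\
\Rightarrow \lambda u + \mu v &= \lambda (c^1 s_1 + c^2 s_2 + \cdots) + \mu (d^1 s_1 + d^2 s_2 + \cdots) \\
&= (\lambda c^1 + \mu d^1) s_1 + (\lambda c^2 + \mu d^2) s_2 + \cdots
\end{align*}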

This last sum is a linear combination of elements of S, and is thus in span(S). 
Then span(S) is closed under linear combinations, and is thus a subspace 
of V. 


Note that this proof, like many proofs, consisted of little more than just 
writing out the definitions. 


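Example 101 For which values of $a$ does
\[ \operatorname{span} \left\{ \begin{pmatrix} 1 \\ 0 \\ a \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix}, \begin{pmatrix} a \\ 1 \\ 0 \end{pmatrix} \right\} = \mathbb{R}^3? \]
Given an arbitrary vector $\begin{pmatrix} x \\ y \\ z \end{pmatrix}$ in $\mathbb{R}^3$, we need to find constants $r^1, r^2, r^3$ such that
\[ r^1 \begin{pmatrix} 1 \\ 0 \\ a \end{pmatrix} + r^2 \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix} + r^3 \begin{pmatrix} a \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \]
We can write this as a linear system in the unknowns $r^1, r^2, r^3$ as follows:
\[ \begin{pmatrix} 1 & 1 & a \\ 0 & 2 & 1 \\ a & -3 & 0 \end{pmatrix} \begin{pmatrix} r^1 \\ r^2 \\ r^3 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \]
If the matrix $M = \begin{pmatrix} 1 & 1 & a \\ 0 & 2 & 1 \\ a & -3 & 0 \end{pmatrix}$ is invertible, then we can find a solution
\[ M^{-1} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} r^1 \\ r^2 \\ r^3 \end{pmatrix} \]
for any vector $\begin{pmatrix} x \\ y \\ z \end{pmatrix} \in \mathbb{R}^3$.

Therefore we should choose $a$ so that $M$ is invertible:
\[ \text{i.e.,} \quad 0 \neq \det M = -2a^2 + 3 + a = -(2a - 3)(a + 1). \]
Then the span is $\mathbb{R}^3$ if and only if $a \neq -1, \tfrac{3}{2}$.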

Linear systems as spanning sets


Some other very important ways of building subspaces are given in the 
following examples. 


Example 102 (The kernel of a linear map).

Suppose $L : U \to V$ is a linear map between vector spaces. Then if
\[ L(u) = 0 = L(u'), \]
linearity tells us that
\[ L(\alpha u + \beta u') = \alpha L(u) + \beta L(u') = \alpha 0 + \beta 0 = 0. \]
Hence, thanks to the subspace theorem, the set of all vectors in $U$ that are mapped
to the zero vector is a subspace of $U$. It is called the kernel of $L$:
\[ \ker L := \{ u \in U \mid L(u) = 0 \} \subset U. \]
Note that finding kernels is a homogeneous linear systems problem.
Example 103 (The image of a linear map).
Suppose $L : U \to V$ is a linear map between vector spaces. Then if
\[ v = L(u) \quad \text{and} \quad v' = L(u'), \]
linearity tells us that
\[ \alpha v + \beta v' = \alpha L(u) + \beta L(u') = L(\alpha u + \beta u'). \]
Hence, calling once again on the subspace theorem, the set of all vectors in $V$ that
are obtained as outputs of the map $L$ is a subspace. It is called the image of $L$:
\[ \operatorname{im} L := \{ L(u) \mid u \in U \} \subset V. \]
Example 104 (An eigenspace of a linear map).
Suppose $L : V \to V$ is a linear map and $V$ is a vector space. Then if
\[ L(u) = \lambda u \quad \text{and} \quad L(v) = \lambda v, \]
linearity tells us that
\[ L(\alpha u + \beta v) = \alpha L(u) + \beta L(v) = \alpha \lambda u + \beta \lambda v = \lambda (\alpha u + \beta v). \]
Hence, again by the subspace theorem, the set of all vectors in $V$ that obey the eigenvector
equation $L(v) = \lambda v$ is a subspace of $V$. It is called an eigenspace
\[ V_\lambda := \{ v \in V \mid L(v) = \lambda v \}. \]
For most scalars $\lambda$, the only solution to $L(v) = \lambda v$ will be $v = 0$, which yields the
trivial subspace $\{0\}$. When there are nontrivial solutions to $L(v) = \lambda v$, the number $\lambda$
is called an eigenvalue, and carries essential information about the map $L$.


Kernels, images and eigenspaces are discussed in great depth in chapters
16 and 12.


9.3 Review Problems 


Webwork: Subspaces 5, 4,5, 6 
Spans 8 


Reading Problems | 1, 2 


1. Determine if $x - x^3 \in \operatorname{span}\{x^2, 2x + x^2, x + x^3\}$.


2. Let $U$ and $W$ be subspaces of $V$. Are:

(a) $U \cup W$
(b) $U \cap W$

also subspaces? Explain why or why not. Draw examples in $\mathbb{R}^3$.


Hint


3. Let $L : \mathbb{R}^3 \to \mathbb{R}^3$ where


L(x,y,z) = (x + 2y + z, 2x 


Find kerL, imŻL and eigenspaces R4, 


R3. 


Your answers should be subsets of $\mathbb{R}^3$. Express them using the span notation.


Linear Independence


Consider a plane $P$ that includes the origin in $\mathbb{R}^3$ and a collection $\{u, v, w\}$
of non-zero vectors in $P$:


If no two of $u, v$ and $w$ are parallel, then $P = \operatorname{span}\{u, v, w\}$. But any two
non-parallel vectors determine a plane, so we should be able to span the plane using
only two of the vectors $u, v, w$. Then we could choose two of the vectors in
$\{u, v, w\}$ whose span is $P$, and express the other as a linear combination of
those two. Suppose $u$ and $v$ span $P$. Then there exist constants $d^1, d^2$ (not
both zero) such that $w = d^1 u + d^2 v$. Since $w$ can be expressed in terms of $u$
and $v$ we say that it is not independent. More generally, the relationship
\[ c^1 u + c^2 v + c^3 w = 0, \qquad c^i \in \mathbb{R}, \text{ some } c^i \neq 0 \]
expresses the fact that $u, v, w$ are not all independent.


Definition We say that the vectors $v_1, v_2, \ldots, v_n$ are linearly dependent if
there exist constants¹ $c^1, c^2, \ldots, c^n$ not all zero such that
\[ c^1 v_1 + c^2 v_2 + \cdots + c^n v_n = 0. \]
Otherwise, the vectors $v_1, v_2, \ldots, v_n$ are linearly independent.


Remark The zero vector $0_V$ can never be on a list of independent vectors because
$\alpha 0_V = 0_V$ for any scalar $\alpha$.


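Example 105 Consider the following vectors in $\mathbb{R}^3$:
\[ v_1 = \begin{pmatrix} 4 \\ -1 \\ 3 \end{pmatrix}, \quad v_2 = \begin{pmatrix} -3 \\ 7 \\ 4 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 5 \\ 12 \\ 17 \end{pmatrix}, \quad v_4 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}. \]
Are these vectors linearly independent?
No, since $3v_1 + 2v_2 - v_3 + v_4 = 0$, the vectors are linearly dependent.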
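
A claimed dependence relation is easy to verify by direct arithmetic. The snippet below is a minimal sketch with a helper name of our own choosing (`linear_combination`); it takes $v_1 = (4, -1, 3)$, $v_2 = (-3, 7, 4)$, $v_3 = (5, 12, 17)$, $v_4 = (-1, 1, 0)$ and forms the combination with coefficients $3, 2, -1, 1$.

```python
def linear_combination(coefficients, vectors):
    """Return sum_k coefficients[k] * vectors[k], computed componentwise."""
    return tuple(
        sum(c * component for c, component in zip(coefficients, components))
        for components in zip(*vectors)
    )


v1, v2, v3, v4 = (4, -1, 3), (-3, 7, 4), (5, 12, 17), (-1, 1, 0)

# A result of all zeros confirms the dependence relation.
print(linear_combination([3, 2, -1, 1], [v1, v2, v3, v4]))
```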


Worked Example


10.1 Showing Linear Dependence 


In the above example we were given the linear combination $3v_1 + 2v_2 - v_3 + v_4$
seemingly by magic. The next example shows how to find such a linear
combination, if it exists.


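Example 106 Consider the following vectors in $\mathbb{R}^3$:
\[ v_1 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}. \]
Are they linearly independent?
We need to see whether the system
\[ c^1 v_1 + c^2 v_2 + c^3 v_3 = 0 \]
has any nontrivial solutions for $c^1, c^2, c^3$. We can rewrite this as a homogeneous system by
building a matrix whose columns are the vectors $v_1$, $v_2$ and $v_3$:
\[ \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix} \begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix} = 0. \]
This system has nontrivial solutions if and only if the matrix $M = \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix}$ is singular, so
we should find the determinant of $M$:
\[ \det M = \det \begin{pmatrix} 0 & 1 & 1 \\ 0 & 2 & 2 \\ 1 & 1 & 3 \end{pmatrix} = \det \begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} = 0. \]
Therefore nontrivial solutions exist. At this point we know that the vectors are
linearly dependent. If we need to, we can find coefficients that demonstrate linear
dependence by solving the system of equations:
\[ \left( \begin{array}{ccc|c} 0 & 1 & 1 & 0 \\ 0 & 2 & 2 & 0 \\ 1 & 1 & 3 & 0 \end{array} \right) \sim \left( \begin{array}{ccc|c} 1 & 1 & 3 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right) \sim \left( \begin{array}{ccc|c} 1 & 0 & 2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right). \]
Then $c^3 =: \mu$, $c^2 = -\mu$, and $c^1 = -2\mu$. Now any choice of $\mu$ will produce
coefficients $c^1, c^2, c^3$ that satisfy the linear equation. So we can set $\mu = 1$ and obtain:
\[ c^1 v_1 + c^2 v_2 + c^3 v_3 = 0 \Rightarrow -2v_1 - v_2 + v_3 = 0. \]

¹Usually our vector spaces are defined over $\mathbb{R}$, but in general we can have vector spaces
defined over different base fields such as $\mathbb{C}$ or $\mathbb{Z}_2$. The coefficients $c^i$ should come from
whatever our base field is (usually $\mathbb{R}$).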
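
The row reduction in this example can be automated. What follows is a sketch rather than a reference implementation: `rref` and `null_space_basis` are hypothetical helper names, columns are indexed from 0, and exact `Fraction` arithmetic is assumed so that no rounding occurs. Each vector in the returned basis lists coefficients $(c^1, c^2, c^3)$ of a vanishing linear combination of the columns.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    pivots = []
    r = 0  # index of the row that receives the next pivot
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column; it is a free column
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [x / M[r][c] for x in M[r]]  # scale the pivot to 1
        for i in range(n_rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return M, pivots


def null_space_basis(rows):
    """One basis vector of solutions to M c = 0 per free column."""
    R, pivots = rref(rows)
    n_cols = len(rows[0])
    free_columns = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free_columns:
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_index, p in enumerate(pivots):
            vec[p] = -R[row_index][f]
        basis.append(vec)
    return basis


# Columns are v1 = (0, 0, 1), v2 = (1, 2, 1), v3 = (1, 2, 3).
M = [[0, 1, 1],
     [0, 2, 2],
     [1, 1, 3]]

print(null_space_basis(M))
```

An empty list from `null_space_basis` would mean that only the trivial solution exists, that is, the columns are linearly independent.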


Reading homework: problem 1


Theorem 10.1.1 (Linear Dependence). An ordered set of non-zero vectors
$(v_1, \ldots, v_n)$ is linearly dependent if and only if one of the vectors $v_k$ is
expressible as a linear combination of the preceding vectors.


Proof. The theorem is an if and only if statement, so there are two things to
show.


i. First, we show that if $v_k = c^1 v_1 + \cdots + c^{k-1} v_{k-1}$ then the set is linearly
dependent.

This is easy. We just rewrite the assumption:
\[ c^1 v_1 + \cdots + c^{k-1} v_{k-1} - v_k + 0 v_{k+1} + \cdots + 0 v_n = 0. \]
This is a vanishing linear combination of the vectors $\{v_1, \ldots, v_n\}$ with
not all coefficients equal to zero, so $\{v_1, \ldots, v_n\}$ is a linearly dependent
set.

ii. Now, we show that linear dependence implies that there exists $k$ for
which $v_k$ is a linear combination of the vectors $\{v_1, \ldots, v_{k-1}\}$.

The assumption says that
\[ c^1 v_1 + c^2 v_2 + \cdots + c^n v_n = 0. \]
Take $k$ to be the largest number for which $c^k$ is not equal to zero. So:
\[ c^1 v_1 + c^2 v_2 + \cdots + c^{k-1} v_{k-1} + c^k v_k = 0. \]
(Note that $k > 1$, since otherwise we would have $c^1 v_1 = 0 \Rightarrow v_1 = 0$,
contradicting the assumption that none of the $v_i$ are the zero vector.)

As such, we can rearrange the equation:
\begin{align*}
c^1 v_1 + c^2 v_2 + \cdots + c^{k-1} v_{k-1} &= -c^k v_k \\
\Rightarrow -\frac{c^1}{c^k} v_1 - \frac{c^2}{c^k} v_2 - \cdots - \frac{c^{k-1}}{c^k} v_{k-1} &= v_k.
\end{align*}
Therefore we have expressed $v_k$ as a linear combination of the previous
vectors, and we are done.


Worked proof


Example 107 Consider the vector space $P_2(t)$ of polynomials of degree less than or
equal to 2. Set:
\begin{align*}
v_1 &= 1 + t \\
v_2 &= 1 + t^2 \\
v_3 &= t + t^2 \\
v_4 &= 2 + t + t^2 \\
v_5 &= 1 + t + t^2.
\end{align*}
The set $\{v_1, \ldots, v_5\}$ is linearly dependent, because $v_4 = v_1 + v_2$.


10.2 Showing Linear Independence


We have seen two different ways to show a set of vectors is linearly dependent:
we can either find a linear combination of the vectors, with coefficients not
all zero, which is equal to zero, or we can express one of the vectors as a
linear combination of the other vectors. On the other hand, to check that a
set of vectors is linearly independent, we must check that every linear
combination of our vectors with coefficients not all zero gives something
other than the zero vector.


Equivalently, to show that the set $v_1, v_2, \ldots, v_n$ is linearly independent, we
must show that the equation $c_1 v_1 + c_2 v_2 + \cdots + c_n v_n = 0$ has no solutions
other than $c_1 = c_2 = \cdots = c_n = 0$.


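Example 108 Consider the following vectors in $\mathbb{R}^3$:
\[ v_1 = \begin{pmatrix} 0 \\ 0 \\ 2 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 1 \\ 4 \\ 3 \end{pmatrix}. \]
Are they linearly independent?
We need to see whether the system
\[ c^1 v_1 + c^2 v_2 + c^3 v_3 = 0 \]
has any nontrivial solutions for $c^1, c^2, c^3$. We can rewrite this as a homogeneous system:
\[ \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix} \begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix} = 0. \]
This system has nontrivial solutions if and only if the matrix $M = \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix}$ is singular, so
we should find the determinant of $M$:
\[ \det M = \det \begin{pmatrix} 0 & 2 & 1 \\ 0 & 2 & 4 \\ 2 & 1 & 3 \end{pmatrix} = 2 \det \begin{pmatrix} 2 & 1 \\ 2 & 4 \end{pmatrix} = 12. \]
Since the matrix $M$ has non-zero determinant, the only solution to the system of
equations
\[ \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix} \begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix} = 0 \]
is $c^1 = c^2 = c^3 = 0$. So the vectors $v_1, v_2, v_3$ are linearly independent.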


Reading homework: problem 2


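10.3 From Dependent to Independent

Now suppose vectors $v_1, \ldots, v_n$ are linearly dependent,
\[ c^1 v_1 + c^2 v_2 + \cdots + c^n v_n = 0 \]
with $c^1 \neq 0$. Then:
\[ \operatorname{span}\{v_1, \ldots, v_n\} = \operatorname{span}\{v_2, \ldots, v_n\} \]
because any $x \in \operatorname{span}\{v_1, \ldots, v_n\}$ is given by
\begin{align*}
x &= a^1 v_1 + \cdots + a^n v_n \\
  &= a^1 \left( -\frac{c^2}{c^1} v_2 - \cdots - \frac{c^n}{c^1} v_n \right) + a^2 v_2 + \cdots + a^n v_n \\
  &= \left( a^2 - a^1 \frac{c^2}{c^1} \right) v_2 + \cdots + \left( a^n - a^1 \frac{c^n}{c^1} \right) v_n.
\end{align*}
Then $x$ is in $\operatorname{span}\{v_2, \ldots, v_n\}$.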

When we write a vector space as the span of a list of vectors, we would like 
that list to be as short as possible (this idea is explored further in chapter 11). 
This can be achieved by iterating the above procedure. 
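
One way to iterate this procedure is to walk through the list and keep a vector only if it is not a linear combination of the vectors already kept. The sketch below makes that concrete under our own assumptions: vectors are given as coordinate tuples, `rank` and `prune_to_independent` are hypothetical helper names, and the sample data encodes the polynomials $1 + t$, $1 + t^2$, $t + t^2$, $2 + t + t^2$, $1 + t + t^2$ by their coefficients of $(1, t, t^2)$.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, found by forward elimination on exact fractions."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(found, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][c] / rows[found][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[found])]
        found += 1
    return found


def prune_to_independent(vectors):
    """Drop every vector that lies in the span of the vectors kept before it."""
    kept = []
    for v in vectors:
        if rank(kept + [v]) > len(kept):
            kept.append(v)
    return kept


polynomials = [
    (1, 1, 0),  # 1 + t
    (1, 0, 1),  # 1 + t^2
    (0, 1, 1),  # t + t^2
    (2, 1, 1),  # 2 + t + t^2
    (1, 1, 1),  # 1 + t + t^2
]

print(prune_to_independent(polynomials))
```

The kept vectors have the same span as the original list, because each discarded vector was a combination of earlier ones. Which vectors survive depends on the order of the list.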


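Example 109 In Example 107, we found that $v_4 = v_1 + v_2$. In this case,
any expression for a vector as a linear combination involving $v_4$ can be turned into a
combination without $v_4$ by making the substitution $v_4 = v_1 + v_2$.

Then:
\begin{align*}
S &= \operatorname{span}\{1 + t, 1 + t^2, t + t^2, 2 + t + t^2, 1 + t + t^2\} \\
  &= \operatorname{span}\{1 + t, 1 + t^2, t + t^2, 1 + t + t^2\}.
\end{align*}
Now we notice that $1 + t + t^2 = \frac{1}{2}(1 + t) + \frac{1}{2}(1 + t^2) + \frac{1}{2}(t + t^2)$. So the vector
$1 + t + t^2 = v_5$ is also extraneous, since it can be expressed as a linear combination of
the remaining three vectors, $v_1, v_2, v_3$. Therefore
\[ S = \operatorname{span}\{1 + t, 1 + t^2, t + t^2\}. \]
In fact, you can check that there are no (non-zero) solutions to the linear system
\[ c^1 (1 + t) + c^2 (1 + t^2) + c^3 (t + t^2) = 0. \]
Therefore the remaining vectors $\{1 + t, 1 + t^2, t + t^2\}$ are linearly independent, and
span the vector space $S$. Then these vectors are a minimal spanning set, in the sense
that no more vectors can be removed since the vectors are linearly independent. Such
a set is called a basis for $S$.

Example 110 Let $\mathbb{Z}_2^3$ be the space of $3 \times 1$ bit-valued matrices (i.e., column vectors).
Is the following subset linearly independent?
\[ \left\{ \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right\} \]
If the set is linearly dependent, then we can find non-zero solutions to the system:
\[ c^1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c^2 \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + c^3 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = 0, \]
which becomes the linear system
\[ \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix} = 0. \]
Non-trivial solutions exist if and only if the determinant of the matrix is zero. But:
\[ \det \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} = 1 \det \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} - 1 \det \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = -1 - 1 = 1 + 1 = 0, \]
where the last two steps use arithmetic in $\mathbb{Z}_2$.
Therefore non-trivial solutions exist, and the set is not linearly independent.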
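
Over $\mathbb{Z}_2$ there are only finitely many coefficient choices, so dependence can also be settled by exhaustive search. The sketch below is illustrative: `nontrivial_relations_mod2` is a hypothetical helper that tries every choice of bits $(c^1, c^2, c^3)$ other than all zeros and records those whose combination vanishes modulo 2.

```python
from itertools import product


def nontrivial_relations_mod2(vectors):
    """All non-zero bit tuples c with sum_k c[k] * vectors[k] = 0 (mod 2)."""
    relations = []
    for coeffs in product((0, 1), repeat=len(vectors)):
        if not any(coeffs):
            continue  # skip the trivial combination
        total = tuple(
            sum(c * x for c, x in zip(coeffs, components)) % 2
            for components in zip(*vectors)
        )
        if not any(total):
            relations.append(coeffs)
    return relations


vectors = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]

print(nontrivial_relations_mod2(vectors))
```

A non-empty result exhibits an explicit dependence. Over $\mathbb{R}$ the same three vectors are independent (their determinant is $-2$), so the answer genuinely depends on the base field.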


10.4 Review Problems 


Webwork:
Reading Problems                       1, 2
Testing for linear independence        3, 4
Gaussian elimination                   5
Spanning and linear independence       6


1. Let $B^n$ be the space of $n \times 1$ bit-valued matrices (i.e., column vectors)
over the field $\mathbb{Z}_2$. Remember that this means that the coefficients in
any linear combination can be only 0 or 1, with rules for adding and
multiplying coefficients given here.

(a) How many different vectors are there in $B^n$?

(b) Find a collection $S$ of vectors that span $B^3$ and are linearly independent.
In other words, find a basis of $B^3$.

(c) Write each other vector in $B^3$ as a linear combination of the vectors
in the set $S$ that you chose.

(d) Would it be possible to span $B^3$ with only two vectors?


Hint


2. Let $e_i$ be the vector in $\mathbb{R}^n$ with a 1 in the $i$th position and 0’s in every
other position. Let $v$ be an arbitrary vector in $\mathbb{R}^n$.

(a) Show that the collection $\{e_1, \ldots, e_n\}$ is linearly independent.

(b) Demonstrate that $v = \sum_{i=1}^{n} (v \cdot e_i) e_i$.

(c) The span$\{e_1, \ldots, e_n\}$ is the same as what vector space?


3. Consider the ordered set of vectors from R3 


(a) Determine if the set is linearly independent by using the vectors 
as the columns of a matrix M and finding RREF (M). 


(b) If possible, write each vector as a linear combination of the pre- 
ceding ones. 


(c) Remove the vectors which can be expressed as linear combinations
of the preceding vectors to form a linearly independent ordered set.
(Every vector in your set should be from the given set.)


4. Gaussian elimination is a useful tool to figure out whether a set of vectors
spans a vector space and if they are linearly independent. Consider a
matrix $M$ made from an ordered set of column vectors $(v_1, v_2, \ldots, v_m) \subset \mathbb{R}^n$
and the three cases listed below:

(a) RREF$(M)$ is the identity matrix.
(b) RREF$(M)$ has a row of zeros.
(c) Neither case (a) nor case (b) applies.

First give an explicit example for each case, and state whether the column
vectors you use are linearly independent or spanning in each case.
Then, in general, determine whether $(v_1, v_2, \ldots, v_m)$ are linearly independent
and/or spanning $\mathbb{R}^n$ in each of the three cases. If they are
linearly dependent, does RREF$(M)$ tell you which vectors could be
removed to yield an independent set of vectors?


Basis and Dimension


In chapter 10, the notions of a linearly independent set of vectors in a vector 
space V, and of a set of vectors that span V were established: Any set of 
vectors that span V can be reduced to some minimal collection of linearly 
independent vectors; such a set is called a basis of the subspace V. 


Definition Let V be a vector space. Then a set S is a basis for V if S is 
linearly independent and V = span S. 

If S is a basis of V and S has only finitely many elements, then we say 
that V is finite-dimensional. The number of vectors in S is the dimension 
of V. 


Suppose V is a finite-dimensional vector space, and S and T are two dif- 
ferent bases for V. One might worry that S and T have a different number of 
vectors; then we would have to talk about the dimension of V in terms of the 
basis S or in terms of the basis T. Luckily this isn’t what happens. Later in 
this chapter, we will show that S and T must have the same number of vec- 
tors. This means that the dimension of a vector space is basis-independent. 
In fact, dimension is a very important characteristic of a vector space. 


Example 111 $P_n(t)$ (polynomials in $t$ of degree $n$ or less) has a basis $\{1, t, \ldots, t^n\}$,
since every vector in this space is a sum
\[ a^0 1 + a^1 t + \cdots + a^n t^n, \qquad a^i \in \mathbb{R}, \]
so $P_n(t) = \operatorname{span}\{1, t, \ldots, t^n\}$. This set of vectors is linearly independent: If the
polynomial $p(t) = c^0 1 + c^1 t + \cdots + c^n t^n = 0$, then $c^0 = c^1 = \cdots = c^n = 0$, so $p(t)$ is
the zero polynomial. Thus $P_n(t)$ is finite dimensional, and $\dim P_n(t) = n + 1$.

Theorem 11.0.1. Let $S = \{v_1, \ldots, v_n\}$ be a basis for a vector space $V$.
Then every vector $w \in V$ can be written uniquely as a linear combination of
vectors in the basis $S$:
\[ w = c^1 v_1 + \cdots + c^n v_n. \]


Proof. Since $S$ is a basis for $V$, then span $S = V$, and so there exist constants
$c^i$ such that $w = c^1 v_1 + \cdots + c^n v_n$.
Suppose there exists a second set of constants $d^i$ such that
\[ w = d^1 v_1 + \cdots + d^n v_n. \]
Then:
\begin{align*}
0_V &= w - w \\
    &= c^1 v_1 + \cdots + c^n v_n - d^1 v_1 - \cdots - d^n v_n \\
    &= (c^1 - d^1) v_1 + \cdots + (c^n - d^n) v_n.
\end{align*}
If it occurs exactly once that $c^i \neq d^i$, then the equation reduces to $0 =
(c^i - d^i) v_i$, which is a contradiction since the vectors $v_i$ are assumed to be
non-zero.

If we have more than one $i$ for which $c^i \neq d^i$, we can use this last equation
to write one of the vectors in $S$ as a linear combination of other vectors in $S$,
which contradicts the assumption that $S$ is linearly independent. Then for
every $i$, $c^i = d^i$.


Proof Explanation


Remark This theorem is the one that makes bases so useful: they allow us to convert
abstract vectors into column vectors. By ordering the set $S$ we obtain $B = (v_1, \ldots, v_n)$
and can write
\[ w = (v_1, \ldots, v_n) \begin{pmatrix} c^1 \\ \vdots \\ c^n \end{pmatrix} = \begin{pmatrix} c^1 \\ \vdots \\ c^n \end{pmatrix}_B. \]
Remember that in general it makes no sense to drop the subscript $B$ on the column
vector on the right; most vector spaces are not made from columns of numbers!


Worked Example


Next, we would like to establish a method for determining whether a
collection of vectors forms a basis for $\mathbb{R}^n$. But first, we need to show that
any two bases for a finite-dimensional vector space have the same number of
vectors.


Lemma 11.0.2. If $S = \{v_1, \ldots, v_n\}$ is a basis for a vector space $V$ and
$T = \{w_1, \ldots, w_m\}$ is a linearly independent set of vectors in $V$, then $m \leq n$.


The idea of the proof is to start with the set S and replace vectors in S 
one at a time with vectors from T, such that after each replacement we still 
have a basis for V. 


Reading homework: problem 1


Proof. Since $S$ spans $V$, then the set $\{w_1, v_1, \ldots, v_n\}$ is linearly dependent.
Then we can write $w_1$ as a linear combination of the $v_i$; using that equation,
we can express one of the $v_i$ in terms of $w_1$ and the remaining $v_j$ with $j \neq i$.
Then we can discard one of the $v_i$ from this set to obtain a linearly
independent set that still spans $V$. Now we need to prove that $S_1$ is a basis;
we must show that $S_1$ is linearly independent and that $S_1$ spans $V$.

The set $S_1 = \{w_1, v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n\}$ is linearly independent: By
the previous theorem, there was a unique way to express $w_1$ in terms of
the set $S$. Now, to obtain a contradiction, suppose there is some $k$ and
constants $c^i$ such that
\[ v_k = c^0 w_1 + c^1 v_1 + \cdots + c^{i-1} v_{i-1} + c^{i+1} v_{i+1} + \cdots + c^n v_n. \]
Then replacing $w_1$ with its expression in terms of the collection $S$ gives a way
to express the vector $v_k$ as a linear combination of the vectors in $S$, which
contradicts the linear independence of $S$. On the other hand, we cannot
express $w_1$ as a linear combination of the vectors in $\{v_j \mid j \neq i\}$, since the
expression of $w_1$ in terms of $S$ was unique, and had a non-zero coefficient for
the vector $v_i$. Then no vector in $S_1$ can be expressed as a combination of
other vectors in $S_1$, which demonstrates that $S_1$ is linearly independent.

The set $S_1$ spans $V$: For any $u \in V$, we can express $u$ as a linear combination
of vectors in $S$. But we can express $v_i$ as a linear combination of
vectors in the collection $S_1$; rewriting $v_i$ as such allows us to express $u$ as
a linear combination of the vectors in $S_1$. Thus $S_1$ is a basis of $V$ with $n$
vectors.

We can now iterate this process, replacing one of the $v_i$ in $S_1$ with $w_2$,
and so on. If $m \leq n$, this process ends with the set $S_m = \{w_1, \ldots, w_m,
v_{i_1}, \ldots, v_{i_{n-m}}\}$, which is fine.

Otherwise, we have $m > n$, and the set $S_n = \{w_1, \ldots, w_n\}$ is a basis
for $V$. But we still have some vector $w_{n+1}$ in $T$ that is not in $S_n$. Since $S_n$
is a basis, we can write $w_{n+1}$ as a combination of the vectors in $S_n$, which
contradicts the linear independence of the set $T$. Then it must be the case
that $m \leq n$, as desired.


Worked Example


Corollary 11.0.3. For a finite-dimensional vector space V, any two bases 
for V have the same number of vectors. 


Proof. Let $S$ and $T$ be two bases for $V$. Then both are linearly independent
sets that span $V$. Suppose $S$ has $n$ vectors and $T$ has $m$ vectors. Then by
the previous lemma, we have that $m \leq n$. But (exchanging the roles of $S$
and $T$ in application of the lemma) we also see that $n \leq m$. Then $m = n$,
as desired.


Reading homework: problem 2


11.1 Bases in $\mathbb{R}^n$


In review question 2, chapter 10 you checked that
\[ \mathbb{R}^n = \operatorname{span} \left\{ \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \ldots, \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} \right\}, \]
and that this set of vectors is linearly independent. (If you didn’t do that
problem, check this before reading any further!) So this set of vectors is
a basis for $\mathbb{R}^n$, and $\dim \mathbb{R}^n = n$. This basis is often called the standard
or canonical basis for $\mathbb{R}^n$. The vector with a one in the $i$th position and
zeros everywhere else is written $e_i$. (You could also view it as the function
$\{1, 2, \ldots, n\} \to \mathbb{R}$ where $e_i(j) = 1$ if $i = j$ and 0 if $i \neq j$.) It points in the
direction of the $i$th coordinate axis, and has unit length. In multivariable
calculus classes, this basis is often written $\{i, j, k\}$ for $\mathbb{R}^3$.

Note that it is often convenient to order basis elements, so rather than
writing a set of vectors, we would write a list. This is called an ordered
basis. For example, the canonical ordered basis for $\mathbb{R}^n$ is $(e_1, e_2, \ldots, e_n)$. The
possibility to reorder basis vectors is not the only way in which bases are
non-unique:


Bases are not unique. While there exists a unique way to express a vector in terms
of any particular basis, bases themselves are far from unique. For example, both of
the sets:
\[ \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\} \quad \text{and} \quad \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\} \]
are bases for $\mathbb{R}^2$. Rescaling any vector in one of these sets is already enough to show
that $\mathbb{R}^2$ has infinitely many bases. But even if we require that all of the basis vectors
have unit length, it turns out that there are still infinitely many bases for $\mathbb{R}^2$ (see
review question 3).


To see whether a collection of vectors $S = \{v_1, \ldots, v_m\}$ is a basis for $\mathbb{R}^n$, we have to check that they are linearly independent and that they span $\mathbb{R}^n$. From the previous discussion, we also know that $m$ must equal $n$, so let's assume $S$ has $n$ vectors. If $S$ is linearly independent, then there is no non-trivial solution of the equation

$$0 = x^1 v_1 + \cdots + x^n v_n.$$

Let $M$ be a matrix whose columns are the vectors $v_i$ and $X$ the column vector with entries $x^i$. Then the above equation is equivalent to requiring that there is a unique solution (namely $X = 0$) to

$$MX = 0.$$

To see if $S$ spans $\mathbb{R}^n$, we take an arbitrary vector $w$ and solve the linear system

$$w = x^1 v_1 + \cdots + x^n v_n$$

in the unknowns $x^i$. For this, we need to find a unique solution for the linear system $MX = w$.

Thus, we need to show that $M^{-1}$ exists, so that

$$X = M^{-1} w$$

is the unique solution we desire. Then we see that $S$ is a basis for $\mathbb{R}^n$ if and only if $\det M \neq 0$.


Theorem 11.1.1. Let $S = \{v_1, \ldots, v_m\}$ be a collection of vectors in $V = \mathbb{R}^n$. Let $M$ be the matrix whose columns are the vectors in $S$. Then $S$ is a basis for $V$ if and only if $m$ is the dimension of $V$ and

$$\det M \neq 0.$$

Remark Also observe that $S$ is a basis if and only if $\mathrm{RREF}(M) = I$.
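The determinant test of Theorem 11.1.1 can be tried out numerically. The following is a minimal sketch in Python, not part of the original text; the helper names `det` and `is_basis` are hypothetical, and exact fractions are used so that the comparison with zero is reliable.

```python
from fractions import Fraction

def det(matrix):
    """Determinant of a square matrix by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def is_basis(vectors, n):
    """True if the given vectors form a basis of R^n (Theorem 11.1.1)."""
    if len(vectors) != n or any(len(v) != n for v in vectors):
        return False
    # The columns of M are the vectors, so M[i][j] is entry i of vector j.
    m = [[vectors[j][i] for j in range(n)] for i in range(n)]
    return det(m) != 0

print(is_basis([(1, 0), (0, 1)], 2))   # standard basis of R^2
print(is_basis([(1, 1), (1, -1)], 2))  # determinant is -2
print(is_basis([(1, 2), (2, 4)], 2))   # second vector is twice the first
```

The first check in `is_basis` encodes the requirement that $m$ equals the dimension; only then is $M$ square and the determinant defined.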


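Example 112 Let

$$S = \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\} \quad\text{and}\quad T = \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}.$$

Then set $M_S = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Since $\det M_S = 1 \neq 0$, then $S$ is a basis for $\mathbb{R}^2$.

Likewise, set $M_T = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. Since $\det M_T = -2 \neq 0$, then $T$ is a basis for $\mathbb{R}^2$.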


11.2 Matrix of a Linear Transformation (Redux) 


Not only do bases allow us to describe arbitrary vectors as column vectors, they also permit linear transformations to be expressed as matrices. This is a very powerful tool for computations, which is covered in chapter 7 and reviewed again here.

Suppose we have a linear transformation $L\colon V \to W$ and ordered input and output bases $E = (e_1, \ldots, e_n)$ and $F = (f_1, \ldots, f_m)$ for $V$ and $W$ respectively (of course, these need not be the standard basis; in all likelihood $V$ is not $\mathbb{R}^n$). Since for each $e_j$, $L(e_j)$ is a vector in $W$, there exist unique numbers $m^i_j$ such that

$$L(e_j) = f_1 m^1_j + \cdots + f_m m^m_j = (f_1, \ldots, f_m) \begin{pmatrix} m^1_j \\ \vdots \\ m^m_j \end{pmatrix}.$$

The number $m^i_j$ is the $i$th component of $L(e_j)$ in the basis $F$, while the $f_i$ are vectors (note that if $\alpha$ is a scalar, and $v$ a vector, $\alpha v = v \alpha$; we have used the latter, rather uncommon, notation in the above formula). The numbers $m^i_j$ naturally form a matrix whose $j$th column is the column vector displayed above. Indeed, if

$$v = e_1 v^1 + \cdots + e_n v^n,$$

then

$$\begin{aligned}
L(v) &= L(v^1 e_1 + v^2 e_2 + \cdots + v^n e_n) \\
&= v^1 L(e_1) + v^2 L(e_2) + \cdots + v^n L(e_n) = \sum_{j=1}^n L(e_j) v^j \\
&= \sum_{j=1}^n (f_1 m^1_j + \cdots + f_m m^m_j) v^j = \sum_{i=1}^m f_i \left[ \sum_{j=1}^n m^i_j v^j \right] \\
&= (f_1 \ f_2 \ \cdots \ f_m) \begin{pmatrix} m^1_1 & m^1_2 & \cdots & m^1_n \\ m^2_1 & m^2_2 & & \\ \vdots & & \ddots & \vdots \\ m^m_1 & \cdots & & m^m_n \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{pmatrix}.
\end{aligned}$$

In the column vector-basis notation this equality looks familiar:

$$L \begin{pmatrix} v^1 \\ \vdots \\ v^n \end{pmatrix}_E = \left( \begin{pmatrix} m^1_1 & \cdots & m^1_n \\ \vdots & & \vdots \\ m^m_1 & \cdots & m^m_n \end{pmatrix} \begin{pmatrix} v^1 \\ \vdots \\ v^n \end{pmatrix} \right)_F.$$

The array of numbers $M = (m^i_j)$ is called the matrix of $L$ in the input and output bases $E$ and $F$ for $V$ and $W$, respectively. This matrix will change if we change either of the bases. Also observe that the columns of $M$ are computed by examining $L$ acting on each basis vector in $V$ expanded in the basis vectors of $W$.


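Example 113 Let $L\colon P_1(t) \to P_1(t)$, such that $L(a + bt) = (a + b)t$. Since $V = P_1(t) = W$, let's choose the same ordered basis $B = (1 - t, 1 + t)$ for $V$ and $W$.

$$\begin{aligned}
L(1 - t) &= (1 - 1)t = 0 = (1 - t) \cdot 0 + (1 + t) \cdot 0 = (1 - t, 1 + t) \begin{pmatrix} 0 \\ 0 \end{pmatrix} \\
L(1 + t) &= (1 + 1)t = 2t = (1 - t) \cdot (-1) + (1 + t) \cdot 1 = (1 - t, 1 + t) \begin{pmatrix} -1 \\ 1 \end{pmatrix}
\end{aligned}$$

$$\Rightarrow \quad L \begin{pmatrix} a \\ b \end{pmatrix}_B = \left( \begin{pmatrix} 0 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} \right)_B.$$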
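The recipe of Example 113 (apply $L$ to each basis vector, then expand the result in the output basis) can be sketched in Python. This is an illustration under stated assumptions, not the book's code: a polynomial $a + bt$ is stored as the pair `(a, b)`, and the helper names `L` and `coords_in_B` are hypothetical.

```python
from fractions import Fraction

def L(p):
    """L(a + b t) = (a + b) t, with a + b t stored as the pair (a, b)."""
    a, b = p
    return (0, a + b)

def coords_in_B(p):
    """Components (c1, c2) with p = c1 (1 - t) + c2 (1 + t).

    Matching coefficients gives c1 + c2 = p0 and -c1 + c2 = p1.
    """
    p0, p1 = p
    return (Fraction(p0 - p1, 2), Fraction(p0 + p1, 2))

basis_B = [(1, -1), (1, 1)]  # the polynomials 1 - t and 1 + t

# Column j of the matrix holds the B-components of L applied to basis vector j.
columns = [coords_in_B(L(b)) for b in basis_B]
matrix = [[columns[j][i] for j in range(2)] for i in range(2)]
print(matrix)
```

The rows of `matrix` reproduce the matrix found in the example, with first row $(0, -1)$ and second row $(0, 1)$.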
When the vector space is $\mathbb{R}^n$ and the standard basis is used, the problem of finding the matrix of a linear transformation will seem almost trivial. It is worthwhile working through it once in the above language though:


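Example 114 Any vector in $\mathbb{R}^n$ can be written as a linear combination of the standard (ordered) basis $(e_1, \ldots, e_n)$. The vector $e_i$ has a one in the $i$th position, and zeros everywhere else. I.e.

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$$

Then to find the matrix of any linear transformation $L\colon \mathbb{R}^n \to \mathbb{R}^n$, it suffices to know what $L(e_i)$ is for every $i$.

For any matrix $M$, observe that $Me_i$ is equal to the $i$th column of $M$. Then if the $i$th column of $M$ equals $L(e_i)$ for every $i$, then $Mv = L(v)$ for every $v \in \mathbb{R}^n$. Then the matrix representing $L$ in the standard basis is just the matrix whose $i$th column is $L(e_i)$.

For example, if

$$L \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \\ 7 \end{pmatrix}, \quad L \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \\ 8 \end{pmatrix}, \quad L \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \\ 9 \end{pmatrix},$$

then the matrix of $L$ in the standard basis is simply

$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}.$$

Alternatively, this information would often be presented as

$$L \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x + 2y + 3z \\ 4x + 5y + 6z \\ 7x + 8y + 9z \end{pmatrix}.$$

You could either rewrite this as

$$L \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix},$$

to immediately learn the matrix of $L$, or take a more circuitous route:

$$\begin{aligned}
L \begin{pmatrix} x \\ y \\ z \end{pmatrix} &= L \left( x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + z \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} 1 \\ 4 \\ 7 \end{pmatrix} + y \begin{pmatrix} 2 \\ 5 \\ 8 \end{pmatrix} + z \begin{pmatrix} 3 \\ 6 \\ 9 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.
\end{aligned}$$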
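The observation that the $i$th column of the matrix is $L(e_i)$ translates directly into a short program. The sketch below is an illustration only; the function names `standard_basis`, `matrix_of` and `apply` are hypothetical helpers, and `L` is the map from the example above.

```python
def L(v):
    """The linear map of Example 114, given by its formula."""
    x, y, z = v
    return (x + 2 * y + 3 * z, 4 * x + 5 * y + 6 * z, 7 * x + 8 * y + 9 * z)

def standard_basis(n):
    """The vectors e_1, ..., e_n as tuples."""
    return [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]

def matrix_of(linear_map, n):
    """Matrix (list of rows) whose j-th column is linear_map(e_j)."""
    images = [linear_map(e) for e in standard_basis(n)]
    return [[images[j][i] for j in range(n)] for i in range(len(images[0]))]

def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return tuple(sum(a * x for a, x in zip(row, vector)) for row in matrix)

M = matrix_of(L, 3)
print(M)
# Multiplying by M agrees with applying L directly.
print(apply(M, (2, -1, 5)) == L((2, -1, 5)))
```

Only the values of `L` on the three standard basis vectors are used to build `M`; linearity is what guarantees that `M` then agrees with `L` on every other vector.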


11.3 Review Problems 


Reading Problems: 1, 2
Webwork: Basis checks 3, 4
         Computing column vectors 5, 6


1. (a) Draw the collection of all unit vectors in $\mathbb{R}^2$.

(b) Let $S_x = \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, x \right\}$, where $x$ is a unit vector in $\mathbb{R}^2$. For which $x$ is $S_x$ a basis of $\mathbb{R}^2$?

(c) Generalize to $\mathbb{R}^n$.


2. Let $B^n$ be the vector space of column vectors with bit entries 0, 1. Write down every basis for $B^1$ and $B^2$. How many bases are there for $B^3$? $B^4$? Can you make a conjecture for the number of bases for $B^n$?

(Hint: You can build up a basis for $B^n$ by choosing one vector at a time, such that the vector you choose is not in the span of the previous vectors you've chosen. How many vectors are in the span of any one vector? Any two vectors? How many vectors are in the span of any $k$ vectors, for $k \leq n$?)


Hint


3. Suppose that $V$ is an $n$-dimensional vector space.

(a) Show that any $n$ linearly independent vectors in $V$ form a basis.

(Hint: Let $\{w_1, \ldots, w_n\}$ be a collection of $n$ linearly independent vectors in $V$, and let $\{v_1, \ldots, v_n\}$ be a basis for $V$. Apply the method of Lemma 11.0.2 to these two sets of vectors.)

(b) Show that any set of $n$ vectors in $V$ which span $V$ forms a basis for $V$.

(Hint: Suppose that you have a set of $n$ vectors which span $V$ but do not form a basis. What must be true about them? How could you get a basis from this set? Use Corollary 11.0.3 to derive a contradiction.)


4. Let $S = \{v_1, \ldots, v_n\}$ be a subset of a vector space $V$. Show that if every vector $w$ in $V$ can be expressed uniquely as a linear combination of vectors in $S$, then $S$ is a basis of $V$. In other words: suppose that for every vector $w$ in $V$, there is exactly one set of constants $c^1, \ldots, c^n$ so that $c^1 v_1 + \cdots + c^n v_n = w$. Show that this means that the set $S$ is linearly independent and spans $V$. (This is the converse to theorem 11.0.1.)


5. Vectors are objects that you can add together; show that the set of all linear transformations mapping $\mathbb{R}^3 \to \mathbb{R}$ is itself a vector space. Find a basis for this vector space. Do you think your proof could be modified to work for linear transformations $\mathbb{R}^n \to \mathbb{R}$? For $\mathbb{R}^N \to \mathbb{R}^M$? For $\mathbb{R}^{\mathbb{R}}$?

Hint: Represent $\mathbb{R}^3$ as column vectors, and argue that a linear transformation $T\colon \mathbb{R}^3 \to \mathbb{R}$ is just a row vector.


6. Let $S_n$ denote the vector space of all $n \times n$ symmetric matrices $M = M^T$. Let $A_n$ denote the vector space of all $n \times n$ anti-symmetric matrices $M^T = -M$.

(a) Find a basis for $S_3$.
(b) Find a basis for $A_3$.
(c) Can you find a basis for $S_n$? For $A_n$?

Hint: Describe it in terms of the matrices $F^i_j$ which have a 1 in the $i$-th row and the $j$-th column and 0 everywhere else. Note that $\{F^i_j \mid 1 \leq i \leq r, 1 \leq j \leq k\}$ is a basis for $M^r_k$.


7. Give the matrix of the linear transformation $L$ with respect to the input and output bases $B$ and $B'$ listed below:

(a) $L\colon V \to W$ where $B = (v_1, \ldots, v_n)$ is a basis for $V$ and $B' = (L(v_1), \ldots, L(v_n))$ is a basis for $W$.

(b) $L\colon V \to V$ where $B = B' = (v_1, \ldots, v_n)$ and $L(v_i) = \lambda_i v_i$.
Eigenvalues and Eigenvectors


Given only a vector space and no other structure, save for the zero vector, no vector is more important than any other. Once one also has a linear transformation the situation changes dramatically. Consider a vibrating string, whose displacement at point $x$ is given by a function $y(x, t)$. The space of all displacement functions for the string can be modeled by a vector space $V$. At this point, the zero vector (the function $y(x, t) = 0$, drawn in grey) is the only special vector. The wave equation

$$\frac{\partial^2 y}{\partial t^2} = \frac{\partial^2 y}{\partial x^2}$$

is a good model for the string's behavior in time and space. Hence we now have a linear transformation

$$L := \left( \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} \right) \colon V \to V.$$

For example, the function

$$y(x, t) = \sin t \sin x$$

is a very special vector in $V$, which obeys $Ly = 0$. It is an example of an eigenvector of $L$.


12.1 Invariant Directions 


Have a look at the linear transformation $L$ depicted below:

[Figure: a linear transformation $L\colon \mathbb{R}^2 \to \mathbb{R}^2$, showing $L(e_1)$, $L(e_2)$ and $L(e_1) + L(e_2) = L(e_1 + e_2)$.]

It was picked at random by choosing a pair of vectors $L(e_1)$ and $L(e_2)$ as the outputs of $L$ acting on the canonical basis vectors. Notice how the unit square with a corner at the origin is mapped to a parallelogram. The second line of the picture shows these superimposed on one another. Now look at the second picture on that line. There, two vectors $f_1$ and $f_2$ have been carefully chosen such that if the inputs into $L$ are in the parallelogram spanned by $f_1$ and $f_2$, the outputs also form a parallelogram with edges lying along the same two directions. Clearly this is a very special situation that should correspond to interesting properties of $L$.

Now let's try an explicit example to see if we can achieve the last picture:


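Example 115 Consider the linear transformation $L$ such that

$$L \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -4 \\ -10 \end{pmatrix} \quad\text{and}\quad L \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 7 \end{pmatrix},$$

so that the matrix of $L$ is

$$\begin{pmatrix} -4 & 3 \\ -10 & 7 \end{pmatrix}.$$

Recall that a vector is a direction and a magnitude; $L$ applied to $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ or $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ changes both the direction and the magnitude of the vectors given to it.

Notice that

$$L \begin{pmatrix} 3 \\ 5 \end{pmatrix} = \begin{pmatrix} -4 \cdot 3 + 3 \cdot 5 \\ -10 \cdot 3 + 7 \cdot 5 \end{pmatrix} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}.$$

Then $L$ fixes the direction (and actually also the magnitude) of the vector $v_1 = \begin{pmatrix} 3 \\ 5 \end{pmatrix}$.

Reading homework: problem 1

Now, notice that any vector with the same direction as $v_1$ can be written as $cv_1$ for some constant $c$. Then $L(cv_1) = cL(v_1) = cv_1$, so $L$ fixes every vector pointing in the same direction as $v_1$.

Also notice that

$$L \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -4 \cdot 1 + 3 \cdot 2 \\ -10 \cdot 1 + 7 \cdot 2 \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 2 \end{pmatrix},$$

so $L$ fixes the direction of the vector $v_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ but stretches $v_2$ by a factor of 2. Now notice that for any constant $c$, $L(cv_2) = cL(v_2) = 2cv_2$. Then $L$ stretches every vector pointing in the same direction as $v_2$ by a factor of 2.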
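The two invariant directions of Example 115 are easy to check by machine. The following Python sketch is illustrative only; the helpers `apply` and `combine` are hypothetical names, and the coefficients `r` and `s` are arbitrary sample values.

```python
def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (tuple)."""
    return tuple(sum(a * x for a, x in zip(row, vector)) for row in matrix)

def combine(r, v, s, w):
    """Return the linear combination r*v + s*w."""
    return tuple(r * a + s * b for a, b in zip(v, w))

M = [[-4, 3], [-10, 7]]
v1, v2 = (3, 5), (1, 2)

print(apply(M, v1))  # same vector back: L fixes v1
print(apply(M, v2))  # twice v2: L stretches v2 by a factor of 2

# For w = r*v1 + s*v2, linearity gives L(w) = r*v1 + 2*s*v2.
r, s = 2, 3
w = combine(r, v1, s, v2)
print(apply(M, w) == combine(r, v1, 2 * s, v2))
```

The last line checks that, in terms of $v_1$ and $v_2$, the map simply multiplies the first coefficient by 1 and the second by 2.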


In short, given a linear transformation $L$ it is sometimes possible to find a vector $v \neq 0$ and constant $\lambda \neq 0$ such that $Lv = \lambda v$. We call the direction of the vector $v$ an invariant direction. In fact, any vector pointing in the same direction also satisfies this equation because $L(cv) = cL(v) = \lambda cv$. More generally, any non-zero vector $v$ that solves

$$Lv = \lambda v$$

is called an eigenvector of $L$, and $\lambda$ (which is now allowed to be zero) is an eigenvalue. Since the direction is all we really care about here, any other vector $cv$ (so long as $c \neq 0$) is an equally good choice of eigenvector. Notice that the relation "$u$ and $v$ point in the same direction" is an equivalence relation.

Figure 12.1: The eigenvalue-eigenvector equation $L(v) = \lambda v$ is probably the most important one in linear algebra.


In our example of the linear transformation $L$ with matrix

$$\begin{pmatrix} -4 & 3 \\ -10 & 7 \end{pmatrix},$$

we have seen that $L$ enjoys the property of having two invariant directions, represented by eigenvectors $v_1$ and $v_2$ with eigenvalues 1 and 2, respectively.

It would be very convenient if we could write any vector $w$ as a linear combination of $v_1$ and $v_2$. Suppose $w = rv_1 + sv_2$ for some constants $r$ and $s$. Then:

$$L(w) = L(rv_1 + sv_2) = rL(v_1) + sL(v_2) = rv_1 + 2sv_2.$$

Now $L$ just multiplies the number $r$ by 1 and the number $s$ by 2. If we could write this as a matrix, it would look like:

$$\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} r \\ s \end{pmatrix},$$

which is much slicker than the usual scenario

$$L \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}.$$

Here, $r$ and $s$ give the coordinates of $w$ in terms of the vectors $v_1$ and $v_2$. In the previous example, we multiplied the vector by the matrix $L$ and came up with a complicated expression. In these coordinates, we see that $L$ has a very simple diagonal matrix, whose diagonal entries are exactly the eigenvalues of $L$.

This process is called diagonalization. It makes complicated linear systems much easier to analyze.

Reading homework: problem 2


Now that we've seen what eigenvalues and eigenvectors are, there are a number of questions that need to be answered.

• How do we find eigenvectors and their eigenvalues?

• How many eigenvalues and (independent) eigenvectors does a given linear transformation have?

• When can a linear transformation be diagonalized?

We'll start by trying to find the eigenvectors for a linear transformation.

A 2×2 Example


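Example 116 Let $L\colon \mathbb{R}^2 \to \mathbb{R}^2$ such that $L(x, y) = (2x + 2y, 16x + 6y)$. First, we find the matrix of $L$:

$$\begin{pmatrix} x \\ y \end{pmatrix} \overset{L}{\longmapsto} \begin{pmatrix} 2 & 2 \\ 16 & 6 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$

We want to find an invariant direction $v = \begin{pmatrix} x \\ y \end{pmatrix}$ such that

$$Lv = \lambda v$$

or, in matrix notation,

$$\begin{pmatrix} 2 & 2 \\ 16 & 6 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix} \quad\Leftrightarrow\quad \begin{pmatrix} 2 - \lambda & 2 \\ 16 & 6 - \lambda \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

This is a homogeneous system, so it only has non-zero solutions when the matrix $\begin{pmatrix} 2 - \lambda & 2 \\ 16 & 6 - \lambda \end{pmatrix}$ is singular. In other words,

$$\begin{aligned}
\det \begin{pmatrix} 2 - \lambda & 2 \\ 16 & 6 - \lambda \end{pmatrix} &= 0 \\
\Leftrightarrow \quad (2 - \lambda)(6 - \lambda) - 32 &= 0 \\
\Leftrightarrow \quad \lambda^2 - 8\lambda - 20 &= 0 \\
\Leftrightarrow \quad (\lambda - 10)(\lambda + 2) &= 0.
\end{aligned}$$

For any square $n \times n$ matrix $M$, the polynomial in $\lambda$ given by

$$P_M(\lambda) = \det(\lambda I - M) = (-1)^n \det(M - \lambda I)$$

is called the characteristic polynomial of $M$, and its roots are the eigenvalues of $M$.

In this case, we see that $L$ has two eigenvalues, $\lambda_1 = 10$ and $\lambda_2 = -2$. To find the eigenvectors, we need to deal with these two cases separately. To do so, we solve the linear system $\begin{pmatrix} 2 - \lambda & 2 \\ 16 & 6 - \lambda \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ with the particular eigenvalue $\lambda$ plugged in to the matrix.

$\lambda = 10$: We solve the linear system

$$\begin{pmatrix} -8 & 2 \\ 16 & -4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Both equations say that $y = 4x$, so any vector $\begin{pmatrix} x \\ 4x \end{pmatrix}$ will do. Since we only need the direction of the eigenvector, we can pick a value for $x$. Setting $x = 1$ is convenient, and gives the eigenvector $v_1 = \begin{pmatrix} 1 \\ 4 \end{pmatrix}$.

$\lambda = -2$: We solve the linear system

$$\begin{pmatrix} 4 & 2 \\ 16 & 8 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Here again both equations agree, because we chose $\lambda$ to make the system singular. We see that $y = -2x$ works, so we can choose $v_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$.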
Our process was the following:

• Find the characteristic polynomial of the matrix $M$ for $L$, given by $\det(\lambda I - M)$ (see footnote 1).

• Find the roots of the characteristic polynomial; these are the eigenvalues of $L$.

• For each eigenvalue $\lambda_i$, solve the linear system $(M - \lambda_i I)v = 0$ to obtain an eigenvector $v$ associated to $\lambda_i$.
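The three steps above can be sketched for a $2 \times 2$ integer matrix as follows. This is a minimal illustration, not a general eigenvalue solver: the function name `eigen_2x2` is hypothetical, and the sketch assumes the characteristic polynomial has rational roots so that everything can be done in exact arithmetic.

```python
from fractions import Fraction
from math import isqrt

def eigen_2x2(m):
    """Return [(eigenvalue, eigenvector), ...] for a 2x2 integer matrix.

    Assumption of this sketch: the discriminant of the characteristic
    polynomial is a perfect square, so the eigenvalues are rational.
    A repeated eigenvalue is simply reported twice.
    """
    (a, b), (c, d) = m
    # Step 1: characteristic polynomial lambda^2 - trace*lambda + det.
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("no real eigenvalues")
    root = isqrt(disc)
    if root * root != disc:
        raise ValueError("irrational eigenvalues are outside this sketch")
    pairs = []
    # Step 2: the roots, from the quadratic formula.
    for lam in (Fraction(trace + root, 2), Fraction(trace - root, 2)):
        # Step 3: solve (M - lam*I)v = 0 using one non-zero row of the system.
        if (b, lam - a) != (0, 0):
            v = (Fraction(b), lam - a)      # solves (a - lam)x + b*y = 0
        elif (lam - d, c) != (0, 0):
            v = (lam - d, Fraction(c))      # solves c*x + (d - lam)y = 0
        else:
            v = (Fraction(1), Fraction(0))  # M - lam*I is zero: any vector works
        pairs.append((lam, v))
    return pairs

for lam, v in eigen_2x2([[2, 2], [16, 6]]):
    print(lam, v)
```

For the matrix of Example 116 the eigenvectors come out as multiples of $(1, 4)$ and $(1, -2)$; any non-zero multiple is an equally good choice.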


Jordan block example


12.2 The Eigenvalue-Eigenvector Equation


In section 12.1, we developed the idea of eigenvalues and eigenvectors in the case of linear transformations $\mathbb{R}^2 \to \mathbb{R}^2$. In this section, we will develop the idea more generally.


Eigenvalues


Definition For a linear transformation $L\colon V \to V$, $\lambda$ is an eigenvalue of $L$ with eigenvector $v \neq 0_V$ if

$$Lv = \lambda v.$$

This equation says that the direction of $v$ is invariant (unchanged) under $L$.

Let's try to understand this equation better in terms of matrices. Let $V$ be a finite-dimensional vector space and let $L\colon V \to V$. If we have a basis for $V$ we can represent $L$ by a square matrix $M$ and find eigenvalues $\lambda$ and associated eigenvectors $v$ by solving the homogeneous system

$$(M - \lambda I)v = 0.$$

This system has non-zero solutions if and only if the matrix

$$M - \lambda I$$

is singular, and so we require that

$$\det(\lambda I - M) = 0.$$

Figure 12.2: Don't forget the characteristic polynomial; you will need it to compute eigenvalues.

Footnote 1: To save writing many minus signs compute $\det(M - \lambda I)$, which is equivalent if you only need the roots.

The left hand side of this equation is a polynomial in the variable $\lambda$ called the characteristic polynomial $P_M(\lambda)$ of $M$. For an $n \times n$ matrix, the characteristic polynomial has degree $n$. Then

$$P_M(\lambda) = \lambda^n + c_1 \lambda^{n-1} + \cdots + c_n.$$

Notice that $P_M(0) = \det(-M) = (-1)^n \det M$.

The fundamental theorem of algebra states that any polynomial can be factored into a product of first order polynomials over $\mathbb{C}$. Then there exists a collection of $n$ complex numbers $\lambda_i$ (possibly with repetition) such that

$$P_M(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n) \quad\Rightarrow\quad P_M(\lambda_i) = 0.$$

The eigenvalues $\lambda_i$ of $M$ are exactly the roots of $P_M(\lambda)$. These eigenvalues could be real or complex or zero, and they need not all be different. The number of times that any given root $\lambda_i$ appears in the collection of eigenvalues is called its multiplicity.


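Example 117 Let $L$ be the linear transformation $L\colon \mathbb{R}^3 \to \mathbb{R}^3$ given by

$$L \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 2x + y - z \\ x + 2y - z \\ -x - y + 2z \end{pmatrix}.$$

In the standard basis the matrix $M$ representing $L$ has columns $Le_i$ for each $i$, so:

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} \overset{L}{\longmapsto} \begin{pmatrix} 2 & 1 & -1 \\ 1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$

Then the characteristic polynomial of $L$ is (see footnote 2)

$$\begin{aligned}
P_M(\lambda) &= \det \begin{pmatrix} \lambda - 2 & -1 & 1 \\ -1 & \lambda - 2 & 1 \\ 1 & 1 & \lambda - 2 \end{pmatrix} \\
&= (\lambda - 2)[(\lambda - 2)^2 - 1] + [-(\lambda - 2) - 1] + [-(\lambda - 2) - 1] \\
&= (\lambda - 1)^2 (\lambda - 4).
\end{aligned}$$

So $L$ has eigenvalues $\lambda_1 = 1$ (with multiplicity 2), and $\lambda_2 = 4$ (with multiplicity 1).

To find the eigenvectors associated to each eigenvalue, we solve the homogeneous system $(M - \lambda_i I)X = 0$ for each $i$.

$\lambda = 4$: We set up the augmented matrix for the linear system:

$$\left(\begin{array}{ccc|c} -2 & 1 & -1 & 0 \\ 1 & -2 & -1 & 0 \\ -1 & -1 & -2 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & -2 & -1 & 0 \\ 0 & -3 & -3 & 0 \\ 0 & -3 & -3 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$$

So we see that $z =: t$, $y = -t$, and $x = -t$ gives a formula for eigenvectors in terms of the free parameter $t$. Any such eigenvector is of the form $t \begin{pmatrix} -1 \\ -1 \\ 1 \end{pmatrix}$; thus $L$ leaves a line through the origin invariant.

$\lambda = 1$: Again we set up an augmented matrix and find the solution set:

$$\left(\begin{array}{ccc|c} 1 & 1 & -1 & 0 \\ 1 & 1 & -1 & 0 \\ -1 & -1 & 1 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$$

Then the solution set has two free parameters, $s$ and $t$, such that $z =: t$, $y =: s$, and $x = -s + t$. Thus $L$ leaves invariant the set:

$$\left\{ s \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \;\middle|\; s, t \in \mathbb{R} \right\}.$$

This set is a plane through the origin. So the multiplicity two eigenvalue has two independent eigenvectors, $\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$, that determine an invariant plane.

Footnote 2: It is often easier (and equivalent) to solve $\det(M - \lambda I) = 0$.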
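The results of Example 117 can be double-checked with a few lines of Python. The sketch below is illustrative only; `apply`, `det3`, `char_poly_at` and `is_eigenvector` are hypothetical helper names. It evaluates $P_M(\lambda) = \det(\lambda I - M)$ at a few points and confirms that every non-zero combination of the two eigenvectors for $\lambda = 1$ is again an eigenvector.

```python
def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (tuple)."""
    return tuple(sum(a * x for a, x in zip(row, vector)) for row in matrix)

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly_at(m, lam):
    """Evaluate P_M(lam) = det(lam*I - M) for a 3x3 matrix M."""
    shifted = [[(lam if i == j else 0) - m[i][j] for j in range(3)] for i in range(3)]
    return det3(shifted)

def is_eigenvector(m, v, lam):
    """True if v is non-zero and M v = lam v."""
    return any(x != 0 for x in v) and apply(m, v) == tuple(lam * x for x in v)

M = [[2, 1, -1], [1, 2, -1], [-1, -1, 2]]

print(char_poly_at(M, 1), char_poly_at(M, 4))  # both roots give zero
print(is_eigenvector(M, (-1, -1, 1), 4))

# Every non-zero combination s*u + t*w stays in the eigenvalue-1 plane.
u, w = (-1, 1, 0), (1, 0, 1)
combos = [tuple(s * a + t * b for a, b in zip(u, w))
          for s in range(-2, 3) for t in range(-2, 3) if (s, t) != (0, 0)]
print(all(is_eigenvector(M, v, 1) for v in combos))
```

Checking a handful of integer combinations is of course not a proof; it only illustrates that the eigenvectors for one eigenvalue fill out a plane here.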


Example 118 Let $V$ be the vector space of smooth (i.e. infinitely differentiable) functions $f\colon \mathbb{R} \to \mathbb{R}$. Then the derivative is a linear operator $\frac{d}{dx}\colon V \to V$. What are the eigenvectors of the derivative? In this case, we don't have a matrix to work with, so we have to make do.

A function $f$ is an eigenvector of $\frac{d}{dx}$ if there exists some number $\lambda$ such that $\frac{d}{dx} f = \lambda f$. An obvious candidate is the exponential function, $e^{\lambda x}$; indeed, $\frac{d}{dx} e^{\lambda x} = \lambda e^{\lambda x}$. The operator $\frac{d}{dx}$ has an eigenvector $e^{\lambda x}$ for every $\lambda \in \mathbb{R}$.


12.3 Eigenspaces 


In the previous example, we found two eigenvectors

$$\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$$

for $L$, both with eigenvalue 1. Notice that

$$\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}$$

is also an eigenvector of $L$ with eigenvalue 1. In fact, any non-zero linear combination

$$r \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + s \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$$

of these two eigenvectors will be another eigenvector with the same eigenvalue.

More generally, let $\{v_1, v_2, \ldots\}$ be eigenvectors of some linear transformation $L$ with the same eigenvalue $\lambda$. A linear combination of the $v_i$ can be written $c^1 v_1 + c^2 v_2 + \cdots$ for some constants $\{c^1, c^2, \ldots\}$. Then:

$$\begin{aligned}
L(c^1 v_1 + c^2 v_2 + \cdots) &= c^1 L v_1 + c^2 L v_2 + \cdots && \text{by linearity of } L \\
&= c^1 \lambda v_1 + c^2 \lambda v_2 + \cdots && \text{since } L v_i = \lambda v_i \\
&= \lambda (c^1 v_1 + c^2 v_2 + \cdots).
\end{aligned}$$

So every non-zero linear combination of the $v_i$ is an eigenvector of $L$ with the same eigenvalue $\lambda$. In simple terms, any non-zero sum of eigenvectors is again an eigenvector if they share the same eigenvalue.

The space of all vectors with eigenvalue $\lambda$ is called an eigenspace. It is, in fact, a vector space contained within the larger vector space $V$: It contains $0_V$, since $L 0_V = 0_V = \lambda 0_V$, and is closed under addition and scalar multiplication by the above calculation. All other vector space properties are inherited from the fact that $V$ itself is a vector space. In other words, the subspace theorem (9.1.1, chapter 9) ensures that $V_\lambda := \{v \in V \mid Lv = \lambda v\}$ is a subspace of $V$.

Eigenspaces

Reading homework: problem 3

You can now attempt the second sample midterm.
12.4 Review Problems


Reading Problems: 1, 2, 3
Webwork: Characteristic polynomial 4, 5, 6
         Eigenvalues 7, 8
         Eigenspaces 9, 10
         Eigenvectors 11, 12, 13, 14
         Complex eigenvalues 15


1. Try to find more solutions to the vibrating string problem $\partial^2 y / \partial t^2 = \partial^2 y / \partial x^2$ using the ansatz

$$y(x, t) = \sin(\omega t) f(x).$$

What equation must $f(x)$ obey? Can you write this as an eigenvector equation? Suppose that the string has length $L$ and $f(0) = f(L) = 0$. Can you find any solutions for $f(x)$?


2. Let $M = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$. Find all eigenvalues of $M$. Does $M$ have two linearly independent eigenvectors? Is there a basis in which the matrix of $M$ is diagonal? (I.e., can $M$ be diagonalized?)


3. Consider $L\colon \mathbb{R}^2 \to \mathbb{R}^2$ with

$$L \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \cos\theta + y \sin\theta \\ -x \sin\theta + y \cos\theta \end{pmatrix}.$$

(a) Write the matrix of $L$ in the basis $\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}$.

(b) When $\theta \neq 0$, explain how $L$ acts on the plane. Draw a picture.

(c) Do you expect $L$ to have invariant directions?

(d) Try to find real eigenvalues for $L$ by solving the equation $L(v) = \lambda v$.

(e) Are there complex eigenvalues for $L$, assuming that $i = \sqrt{-1}$ exists?
4. Let $L$ be the linear transformation $L\colon \mathbb{R}^3 \to \mathbb{R}^3$ given by

$$L \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x + y \\ x + z \\ y + z \end{pmatrix}.$$

Let $e_i$ be the vector with a one in the $i$th position and zeros in all other positions.

(a) Find $Le_i$ for each $i$.

(b) Given a matrix $M = \begin{pmatrix} m^1_1 & m^1_2 & m^1_3 \\ m^2_1 & m^2_2 & m^2_3 \\ m^3_1 & m^3_2 & m^3_3 \end{pmatrix}$, what can you say about $Me_i$ for each $i$?

(c) Find a $3 \times 3$ matrix $M$ representing $L$. Choose three nonzero vectors pointing in different directions and show that $Mv = Lv$ for each of your choices.

(d) Find the eigenvectors and eigenvalues of $M$.


5. Let $A$ be a matrix with eigenvector $v$ with eigenvalue $\lambda$. Show that $v$ is also an eigenvector for $A^2$ and find its eigenvalue. How about for $A^n$ where $n \in \mathbb{N}$? Suppose that $A$ is invertible. Show that $v$ is also an eigenvector for $A^{-1}$.


6. A projection is a linear operator $P$ such that $P^2 = P$. Let $v$ be an eigenvector with eigenvalue $\lambda$ for a projection $P$; what are all possible values of $\lambda$? Show that every projection $P$ has at least one eigenvector.


Note that every complex matrix has at least 1 eigenvector, but you 
need to prove the above for any field. 


7. Explain why the characteristic polynomial of an n x n matrix has de- 
gree n. Make your explanation easy to read by starting with some 
simple examples, and then use properties of the determinant to give a 
general explanation. 


8. Compute the characteristic polynomial $P_M(\lambda)$ of the matrix

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

Now, since we can evaluate polynomials on square matrices, we can plug $M$ into its characteristic polynomial and find the matrix $P_M(M)$. What do you find from this computation? Does something similar hold for $3 \times 3$ matrices? (Try assuming that the matrix of $M$ is diagonal to answer this.)


9. Discrete dynamical system. Let M be the matrix given by 


w= (22). 
Given any vector $v(0) = \begin{pmatrix} x(0) \\ y(0) \end{pmatrix}$, we can create an infinite sequence of vectors $v(1), v(2), v(3)$, and so on using the rule:

$$v(t + 1) = Mv(t) \text{ for all natural numbers } t.$$

(This is known as a discrete dynamical system whose initial condition is $v(0)$.)

(a) Find all eigenvectors and eigenvalues of $M$.

(b) Find all vectors $v(0)$ such that

$$v(0) = v(1) = v(2) = v(3) = \cdots.$$

(Such a vector is known as a fixed point of the dynamical system.)

(c) Find all vectors $v(0)$ such that $v(0), v(1), v(2), v(3), \ldots$ all point in the same direction. (Any such vector describes an invariant curve of the dynamical system.)

Hint
Diagonalization

Given a linear transformation, it is highly desirable to write its matrix with 
respect to a basis of eigenvectors: 


13.1 Diagonalizability 


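Suppose we are lucky, and we have $L\colon V \to V$, and the ordered basis $B = (v_1, \ldots, v_n)$ is a set of eigenvectors for $L$, with eigenvalues $\lambda_1, \ldots, \lambda_n$. Then:

$$\begin{aligned}
L(v_1) &= \lambda_1 v_1 \\
L(v_2) &= \lambda_2 v_2 \\
&\ \ \vdots \\
L(v_n) &= \lambda_n v_n
\end{aligned}$$

As a result, the matrix of $L$ in the basis $B$ of eigenvectors is diagonal:

$$L \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix}_B = \left( \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix} \right)_B,$$

where all entries off of the diagonal are zero.
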

Suppose that $V$ is any $n$-dimensional vector space. We call a linear transformation $L\colon V \to V$ diagonalizable if there exists a collection of $n$ linearly independent eigenvectors for $L$. In other words, $L$ is diagonalizable if there exists a basis for $V$ of eigenvectors for $L$.

In a basis of eigenvectors, the matrix of a linear transformation is diag- 
onal. On the other hand, if an n x n matrix is diagonal, then the standard 
basis vectors e; must already be a set of n linearly independent eigenvectors. 
We have shown: 


Theorem 13.1.1. Given an ordered basis $B$ for a vector space $V$ and a linear transformation $L\colon V \to V$, then the matrix for $L$ in the basis $B$ is diagonal if and only if $B$ consists of eigenvectors for $L$.

Non-diagonalizable example

Reading homework: problem 1


Typically, however, we do not begin a problem with a basis of eigenvec- 
tors, but rather have to compute these. Hence we need to know how to 
change from one basis to another: 


13.2 Change of Basis 


Suppose we have two ordered bases $S = (v_1, \ldots, v_n)$ and $S' = (v'_1, \ldots, v'_n)$ for a vector space $V$. (Here $v_i$ and $v'_i$ are vectors, not components of vectors in a basis!) Then we may write each $v'_j$ uniquely as a linear combination of the $v_i$:

$$v'_j = \sum_i v_i p^i_j,$$

or in matrix notation

$$(v'_1, v'_2, \ldots, v'_n) = (v_1, v_2, \ldots, v_n) \begin{pmatrix} p^1_1 & p^1_2 & \cdots & p^1_n \\ p^2_1 & p^2_2 & & \\ \vdots & & \ddots & \vdots \\ p^n_1 & & \cdots & p^n_n \end{pmatrix}.$$

Here, the $p^i_j$ are constants, which we can regard as entries of a square matrix $P = (p^i_j)$. The matrix $P$ must have an inverse, since we can also write each $v_j$ uniquely as a linear combination of the $v'_k$:

$$v_j = \sum_k v'_k q^k_j.$$

Then we can write:

$$v'_j = \sum_k \sum_i v'_k q^k_i p^i_j.$$

But $\sum_i q^k_i p^i_j$ is the $k, j$ entry of the product matrix $QP$. Since the expression for $v'_j$ in the basis $S'$ is $v'_j$ itself, then $QP$ maps each $v'_j$ to itself. As a result, each $v'_j$ is an eigenvector for $QP$ with eigenvalue 1, so $QP$ is the identity, i.e.

$$PQ = QP = I \quad\Leftrightarrow\quad Q = P^{-1}.$$

The matrix $P$ is called a change of basis matrix. There is a quick and dirty trick to obtain it: Look at the formula above relating the new basis vectors $v'_1, v'_2, \ldots, v'_n$ to the old ones $v_1, v_2, \ldots, v_n$. In particular focus on $v'_1$ for which

$$v'_1 = (v_1, v_2, \ldots, v_n) \begin{pmatrix} p^1_1 \\ p^2_1 \\ \vdots \\ p^n_1 \end{pmatrix}.$$

This says that the first column of the change of basis matrix $P$ is really just the components of the vector $v'_1$ in the basis $v_1, v_2, \ldots, v_n$, so:

The columns of the change of basis matrix are the components of the new basis vectors in terms of the old basis vectors.

Example 119 Suppose $S' = (v'_1, v'_2)$ is an ordered basis for a vector space $V$ and that with respect to some other ordered basis $S = (v_1, v_2)$ for $V$


This means 


1 1 
= Wp} _ Ut v2 nd oo WE) I 
vi (v1, v2) (2) v2 a Va (v1, v2) =A ZB 


The change of basis matrix has as its columns just the components of v} and v5; 


Changing basis changes the matrix of a linear transformation. However, 
as a map between vector spaces, the linear transformation is the same 
no matter which basis we use. Linear transformations are the actual 
objects of study of this book, not matrices; matrices are merely a convenient 
way of doing computations. 


Change of Basis Example


Let's now calculate how the matrix of a linear transformation changes when changing basis. To wit, let $L\colon V \to W$ with matrix $M = (m^k_i)$ in the ordered input and output bases $S = (v_1, \ldots, v_n)$ and $T = (w_1, \ldots, w_m)$ so

$$L(v_i) = \sum_k w_k m^k_i.$$

Now, suppose $S' = (v'_1, \ldots, v'_n)$ and $T' = (w'_1, \ldots, w'_m)$ are new ordered input and output bases with matrix $M' = (m'^k_i)$. Then

$$L(v'_i) = \sum_k w'_k m'^k_i.$$

Let $P = (p^i_j)$ be the change of basis matrix from input basis $S$ to the basis $S'$ and $Q = (q^j_k)$ be the change of basis matrix from output basis $T$ to the basis $T'$. Then:

$$L(v'_j) = L\left( \sum_i v_i p^i_j \right) = \sum_i L(v_i) p^i_j = \sum_i \sum_k w_k m^k_i p^i_j.$$

Meanwhile, we have:

$$L(v'_i) = \sum_k w'_k m'^k_i = \sum_k \sum_j w_j q^j_k m'^k_i.$$

Since the expression for a vector in a basis is unique, then we see that the entries of $MP$ are the same as the entries of $QM'$. In other words, we see that

$$MP = QM' \quad\Leftrightarrow\quad M' = Q^{-1} M P.$$


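Example 120 Let $V$ be the space of polynomials in $t$ of degree 2 or less and $L\colon V \to \mathbb{R}^2$ where

$$L(1) = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad L(t) = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \quad L(t^2) = \begin{pmatrix} 3 \\ 3 \end{pmatrix}.$$

From this information we can immediately read off the matrix $M$ of $L$ in the bases $S = (1, t, t^2)$ and $T = (e_1, e_2)$, the standard basis for $\mathbb{R}^2$, because

$$(L(1), L(t), L(t^2)) = (e_1 + 2e_2, 2e_1 + e_2, 3e_1 + 3e_2) = (e_1, e_2) \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \quad\Rightarrow\quad M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}.$$

Now suppose we are more interested in the bases

$$S' = (1 + t, t + t^2, 1 + t^2), \quad T' = \left( \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \end{pmatrix} \right) =: (w'_1, w'_2).$$

To compute the new matrix $M'$ of $L$ we could simply calculate what $L$ does to the new input basis vectors in terms of the new output basis vectors:

$$\begin{aligned}
(L(1 + t), L(t + t^2), L(1 + t^2)) &= \left( \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \end{pmatrix} + \begin{pmatrix} 3 \\ 3 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \begin{pmatrix} 3 \\ 3 \end{pmatrix} \right) \\
&= (w'_1 + w'_2, w'_1 + 2w'_2, 2w'_1 + w'_2) \\
&= (w'_1, w'_2) \begin{pmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \end{pmatrix} \quad\Rightarrow\quad M' = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \end{pmatrix}.
\end{aligned}$$

Alternatively we could calculate the change of basis matrices $P$ and $Q$ by noting that

$$(1 + t, t + t^2, 1 + t^2) = (1, t, t^2) \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \quad\Rightarrow\quad P = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$

and

$$(w'_1, w'_2) = (e_1 + 2e_2, 2e_1 + e_2) = (e_1, e_2) \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \quad\Rightarrow\quad Q = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}.$$

Hence

$$M' = Q^{-1} M P = -\frac{1}{3} \begin{pmatrix} 1 & -2 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \end{pmatrix}.$$

Notice that the change of basis matrices $P$ and $Q$ are both square and invertible. Also, since we really wanted $Q^{-1}$, it is more efficient to try and write $(e_1, e_2)$ in terms of $(w'_1, w'_2)$ which would yield directly $Q^{-1}$. Alternatively, one can check that $MP = QM'$.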
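The matrix product in Example 120 is a good candidate for a machine check. The following Python sketch is an illustration, not part of the original text; `matmul` and `inverse_2x2` are hypothetical helper names, and exact fractions are used so that $Q^{-1}$ introduces no rounding.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix with integer entries, as exact fractions."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

M = [[1, 2, 3], [2, 1, 3]]             # L in the bases S and T
P = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]  # columns: new input basis in terms of S
Q = [[1, 2], [2, 1]]                   # columns: new output basis in terms of T

M_new = matmul(matmul(inverse_2x2(Q), M), P)
print(M_new)
print(matmul(M, P) == matmul(Q, M_new))  # the equivalent check MP = QM'
```

The second check avoids computing an inverse at all, which is the shortcut the example mentions in its last sentence.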


13.3 Changing to a Basis of Eigenvectors 


If we are changing to a basis of eigenvectors, then there are various simplifi- 
cations: 


e Since L : V > V, most likely you already know the matrix M of L 
using the same input basis as output basis S = (u1,..., Un) (say). 


e In the new basis of eigenvectors S’(v1,...,Un), the matrix D of L is 
diagonal because Lv; = A;v; and so 


Ki “Os ee: D 
0 Ag 0 
(L(v1), L(v2), ..-, L(un)) = (U1, Va, ---, Un) . o . 
0 0 e An 


e If P is the change of basis matrix from S to S’, the diagonal matrix of 
eigenvalues D and the original matrix are related by 


D = PHMP 


This motivates the following definition: 
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Definition A matrix M is diagonalizable if there exists an invertible matrix 
P and a diagonal matrix D such that 


D = PMP. 
We can summarize as follows: 


• Change of basis rearranges the components of a vector by the change
of basis matrix $P$, to give components in the new basis.


• To get the matrix of a linear transformation in the new basis, we conjugate
the matrix of $L$ by the change of basis matrix: $M \mapsto P^{-1}MP$.


If for two matrices $N$ and $M$ there exists a matrix $P$ such that
$M = P^{-1}NP$, then we say that $M$ and $N$ are similar. Then the above discussion
shows that diagonalizable matrices are similar to diagonal matrices.


Corollary 13.3.1. A square matrix $M$ is diagonalizable if and only if there
exists a basis of eigenvectors for $M$. Moreover, these eigenvectors are the
columns of the change of basis matrix $P$ which diagonalizes $M$.


Reading homework: problem 2


Example 121 Let’s try to diagonalize the matrix 


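$$M = \begin{pmatrix}-14 & -28 & -44\\ -7 & -14 & -23\\ 9 & 18 & 29\end{pmatrix}.$$


The eigenvalues of $M$ are determined by

$$\det(M - \lambda I) = -\lambda^3 + \lambda^2 + 2\lambda = 0.$$


So the eigenvalues of $M$ are $-1$, $0$, and $2$, and associated eigenvectors turn out to be

$$v_1 = \begin{pmatrix}-8\\-1\\3\end{pmatrix},\quad v_2 = \begin{pmatrix}-2\\1\\0\end{pmatrix},\quad\text{and}\quad v_3 = \begin{pmatrix}-1\\-1\\1\end{pmatrix}.$$


In order for $M$ to be diagonalizable, we need the vectors $v_1, v_2, v_3$ to be linearly
independent. Notice that the matrix

$$P = \begin{pmatrix}v_1 & v_2 & v_3\end{pmatrix} = \begin{pmatrix}-8 & -2 & -1\\ -1 & 1 & -1\\ 3 & 0 & 1\end{pmatrix}$$

is invertible because its determinant is $-1$. Therefore, the eigenvectors of $M$ form a
basis of $\mathbb{R}^3$, and so $M$ is diagonalizable. Moreover, because the columns of $P$ are the
components of eigenvectors,

$$MP = \begin{pmatrix}Mv_1 & Mv_2 & Mv_3\end{pmatrix} = \begin{pmatrix}-1\cdot v_1 & 0\cdot v_2 & 2\cdot v_3\end{pmatrix} = \begin{pmatrix}v_1 & v_2 & v_3\end{pmatrix}\begin{pmatrix}-1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 2\end{pmatrix}.$$

Hence, the matrix $P$ of eigenvectors is a change of basis matrix that diagonalizes $M$:

$$P^{-1}MP = \begin{pmatrix}-1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 2\end{pmatrix}.$$

Figure 13.1: This theorem answers the question: “What is diagonalization?”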
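The diagonalization above can be verified without inverting $P$, since $P^{-1}MP = D$ is equivalent to $MP = PD$ once $P$ is known to be invertible. Here is a small Python sketch of that check; the helper names `matmul` and `det3` are ours and are only illustrative.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def det3(A):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

M = [[-14, -28, -44], [-7, -14, -23], [9, 18, 29]]
P = [[-8, -2, -1], [-1, 1, -1], [3, 0, 1]]   # columns are v1, v2, v3
D = [[-1, 0, 0], [0, 0, 0], [0, 0, 2]]       # eigenvalues on the diagonal

print(det3(P))                        # nonzero, so P is invertible
print(matmul(M, P) == matmul(P, D))   # M P = P D
```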


2 × 2 Example


13.4 Review Problems 


Webwork:
    Reading Problems    | 1, 2
    No real eigenvalues | 3
    Diagonalization     | 4, 5, 6, 7


1. Let $P_n(t)$ be the vector space of polynomials of degree $n$ or less, and
$\frac{d}{dt} : P_n(t) \to P_n(t)$ be the derivative operator. Find the matrix of $\frac{d}{dt}$
in the ordered bases $E = (1, t, \ldots, t^n)$ for $P_n(t)$ and $F = (1, t, \ldots, t^n)$
for $P_n(t)$. Determine if this derivative operator is diagonalizable.


Recall from chapter 6 that the derivative operator is linear.

2. When writing a matrix for a linear transformation, we have seen that
the choice of basis matters. In fact, even the order of the basis matters!


(a) Write all possible reorderings of the standard basis $(e_1, e_2, e_3)$
for $\mathbb{R}^3$.


(b) Write each change of basis matrix between the standard basis 
and each of its reorderings. Make as many observations as you 
can about these matrices: what are their entries? Do you notice 
anything about how many of each type of entry appears in each 
row and column? What are their determinants? (Note: These 
matrices are known as permutation matrices.) 


(c) Given $L : \mathbb{R}^3 \to \mathbb{R}^3$ is linear and

$$L\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}2y - z\\ 3x\\ 2z + x + y\end{pmatrix}$$

write the matrix $M$ for $L$ in the standard basis, and two reorderings
of the standard basis. How are these matrices related?


3. Let
$$X = \{\heartsuit, \clubsuit, \spadesuit\}, \qquad Y = \{*, \star\}.$$


Write down two different ordered bases, $S, S'$ and $T, T'$ respectively,
for each of the vector spaces $\mathbb{R}^X$ and $\mathbb{R}^Y$. Find the change of basis
matrices $P$ and $Q$ that map these bases to one another. Now consider
the map

$$\ell : Y \to X,$$

where $\ell(*) = \heartsuit$ and $\ell(\star) = \spadesuit$. Show that $\ell$ can be used to define a
linear transformation $L : \mathbb{R}^X \to \mathbb{R}^Y$. Compute the matrices $M$ and
$M'$ of $L$ in the bases $S, T$ and then $S', T'$. Use your change of basis
matrices $P$ and $Q$ to check that $M' = Q^{-1}MP$.


4. Recall that $\operatorname{tr} MN = \operatorname{tr} NM$. Use this fact to show that the trace of a
square matrix $M$ does not depend on the basis you used to compute $M$.


5. When is the $2 \times 2$ matrix $\begin{pmatrix}a & b\\ c & d\end{pmatrix}$ diagonalizable? Include examples in
your answer.

6. Show that similarity of matrices is an equivalence relation. (The definition
of an equivalence relation is given in the background WeBWork
set.)


7. Jordan form 


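• Can the matrix $\begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}$ be diagonalized? Either diagonalize it or
explain why this is impossible.

• Can the matrix $\begin{pmatrix}\lambda & 1 & 0\\ 0 & \lambda & 1\\ 0 & 0 & \lambda\end{pmatrix}$ be diagonalized? Either diagonalize
it or explain why this is impossible.

• Can the $n \times n$ matrix
$$\begin{pmatrix}\lambda & 1 & 0 & \cdots & 0 & 0\\ 0 & \lambda & 1 & \cdots & 0 & 0\\ 0 & 0 & \lambda & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & \lambda & 1\\ 0 & 0 & 0 & \cdots & 0 & \lambda\end{pmatrix}$$
be diagonalized? Either diagonalize it or explain why this is impossible.


Note: It turns out that every matrix is similar to a block matrix
whose diagonal blocks look like diagonal matrices or the ones
above and whose off-diagonal blocks are all zero. This is called
the Jordan form of the matrix and a (maximal) block that looks
like
$$\begin{pmatrix}\lambda & 1 & 0 & \cdots & 0\\ 0 & \lambda & 1 & & 0\\ \vdots & & \ddots & \ddots & \\ & & & \lambda & 1\\ 0 & 0 & \cdots & 0 & \lambda\end{pmatrix}$$
is called a Jordan $n$-cell or a Jordan block where $n$ is the size of
the block.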


8. Let $A$ and $B$ be commuting matrices (i.e., $AB = BA$) and suppose
that $A$ has an eigenvector $v$ with eigenvalue $\lambda$. Show that $Bv$ is also
an eigenvector of $A$ with eigenvalue $\lambda$. Additionally suppose that $A$
is diagonalizable with distinct eigenvalues. What is the dimension of
each eigenspace of $A$? Show that $v$ is also an eigenvector of $B$. Explain
why this shows that $A$ and $B$ can be simultaneously diagonalized (i.e.,
there is an ordered basis in which both their matrices are diagonal).
Orthonormal Bases and Complements


You may have noticed that we have only rarely used the dot product. That 
is because many of the results we have obtained do not require a preferred 
notion of lengths of vectors. Once a dot or inner product is available, lengths 
of and angles between vectors can be measured—very powerful machinery and 
results are available in this case. 


14.1 Properties of the Standard Basis 


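The standard notion of the length of a vector $x = (x^1, x^2, \ldots, x^n) \in \mathbb{R}^n$ is

$$\|x\| = \sqrt{x \cdot x} = \sqrt{(x^1)^2 + (x^2)^2 + \cdots + (x^n)^2}.$$


The canonical/standard basis in $\mathbb{R}^n$

$$e_1 = \begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix},\quad e_2 = \begin{pmatrix}0\\1\\\vdots\\0\end{pmatrix},\quad \ldots,\quad e_n = \begin{pmatrix}0\\0\\\vdots\\1\end{pmatrix},$$

has many useful properties with respect to the dot product and lengths: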


• Each of the standard basis vectors has unit length:
$$\|e_i\| = \sqrt{e_i \cdot e_i} = \sqrt{e_i^T e_i} = 1.$$

• The standard basis vectors are orthogonal (in other words, at right
angles or perpendicular):
$$e_i \cdot e_j = e_i^T e_j = 0 \quad\text{when } i \neq j.$$


This is summarized by

$$e_i^T e_j = \delta_{ij} = \begin{cases}1 & i = j\\ 0 & i \neq j,\end{cases}$$

where $\delta_{ij}$ is the Kronecker delta. Notice that the Kronecker delta gives the
entries of the identity matrix.


Given column vectors $v$ and $w$, we have seen that the dot product $v \cdot w$ is
the same as the matrix multiplication $v^T w$. This is an inner product on
$\mathbb{R}^n$.


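We can also form the outer product $vw^T$, which gives a square matrix. The
outer product on the standard basis vectors is interesting. Set

$$\Pi_1 = e_1 e_1^T = \begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}\begin{pmatrix}1 & 0 & \cdots & 0\end{pmatrix} = \begin{pmatrix}1 & 0 & \cdots & 0\\ 0 & 0 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 0\end{pmatrix}$$

$$\vdots$$

$$\Pi_n = e_n e_n^T = \begin{pmatrix}0\\0\\\vdots\\1\end{pmatrix}\begin{pmatrix}0 & 0 & \cdots & 1\end{pmatrix} = \begin{pmatrix}0 & 0 & \cdots & 0\\ 0 & 0 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{pmatrix}$$

In short, $\Pi_i$ is the diagonal square matrix with a 1 in the $i$th diagonal position
and zeros everywhere else.¹

Notice that $\Pi_i \Pi_j = e_i e_i^T e_j e_j^T = e_i \delta_{ij} e_j^T$. Then:

$$\Pi_i \Pi_j = \begin{cases}\Pi_i & i = j\\ 0 & i \neq j.\end{cases}$$

Moreover, for a diagonal matrix $D$ with diagonal entries $\lambda_1, \ldots, \lambda_n$, we can
write

$$D = \lambda_1 \Pi_1 + \cdots + \lambda_n \Pi_n.$$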
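The two facts just stated (the product rule for the outer products of standard basis vectors and the decomposition of a diagonal matrix) can be checked directly for a small case. The sketch below uses $n = 3$ and an arbitrary choice of diagonal entries; the helper names are ours.

```python
def outer(v, w):
    """Outer product v w^T as a list of rows."""
    return [[a * b for b in w] for a in v]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

n = 3
e = [[1 if r == i else 0 for r in range(n)] for i in range(n)]  # standard basis
Pi = [outer(e[i], e[i]) for i in range(n)]                      # Pi_i = e_i e_i^T
zero = [[0] * n for _ in range(n)]

# Pi_i Pi_j equals Pi_i when i == j and the zero matrix otherwise.
products_ok = all(
    matmul(Pi[i], Pi[j]) == (Pi[i] if i == j else zero)
    for i in range(n) for j in range(n)
)

# A diagonal matrix is the weighted sum of the Pi_i (entries chosen arbitrarily).
lambdas = [5, -2, 7]
D = [[sum(lambdas[k] * Pi[k][r][c] for k in range(n)) for c in range(n)]
     for r in range(n)]
print(products_ok, D)
```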


14.2 Orthogonal and Orthonormal Bases 


There are many other bases that behave in the same way as the standard 
basis. As such, we will study: 


• Orthogonal bases $\{v_1, \ldots, v_n\}$:
$$v_i \cdot v_j = 0 \quad\text{if } i \neq j.$$
In other words, all vectors in the basis are perpendicular.

• Orthonormal bases $\{u_1, \ldots, u_n\}$:
$$u_i \cdot u_j = \delta_{ij}.$$
In addition to being orthogonal, each vector has unit length.


¹ This is reminiscent of an older notation, where vectors are written in juxtaposition.
This is called a “dyadic tensor”, and is still used in some applications.

Suppose $T = \{u_1, \ldots, u_n\}$ is an orthonormal basis for $\mathbb{R}^n$. Because $T$ is
a basis, we can write any vector $v$ uniquely as a linear combination of the
vectors in $T$:

$$v = c^1 u_1 + \cdots + c^n u_n.$$

Since $T$ is orthonormal, there is a very easy way to find the coefficients of this
linear combination. By taking the dot product of $v$ with any of the vectors
in $T$, we get:

$$\begin{aligned}
v \cdot u_i &= c^1\, u_1 \cdot u_i + \cdots + c^i\, u_i \cdot u_i + \cdots + c^n\, u_n \cdot u_i\\
&= c^1 \cdot 0 + \cdots + c^i \cdot 1 + \cdots + c^n \cdot 0\\
&= c^i,\\
\Rightarrow\ c^i &= v \cdot u_i\\
\Rightarrow\ v &= (v \cdot u_1)u_1 + \cdots + (v \cdot u_n)u_n\\
&= \sum_i (v \cdot u_i)\, u_i.
\end{aligned}$$

This proves the theorem:


Theorem 14.2.1. For an orthonormal basis $\{u_1, \ldots, u_n\}$, any vector $v$ can
be expressed as
$$v = \sum_i (v \cdot u_i)\, u_i.$$
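To see the theorem at work, the following Python sketch expands a vector in an orthonormal basis of $\mathbb{R}^2$ using only dot products. The basis and the vector are hypothetical choices made so that the arithmetic stays exact with `Fraction`.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

# A hypothetical orthonormal basis of R^2 with rational entries.
u1 = [Fraction(3, 5), Fraction(4, 5)]
u2 = [Fraction(-4, 5), Fraction(3, 5)]
v = [Fraction(2), Fraction(1)]

# Coefficients c^i = v . u_i, with no linear system to solve.
coeffs = [dot(v, u) for u in (u1, u2)]

# Rebuild v as the sum of (v . u_i) u_i.
rebuilt = [sum(c * u[k] for c, u in zip(coeffs, (u1, u2))) for k in range(2)]
print(coeffs, rebuilt)
```

For a basis that is not orthonormal, finding the coefficients would instead require solving a linear system, which is the practical advantage the theorem describes.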


Reading homework: problem 1


All orthonormal bases for $\mathbb{R}^2$


14.3 Relating Orthonormal Bases 


Suppose $T = \{u_1, \ldots, u_n\}$ and $R = \{w_1, \ldots, w_n\}$ are two orthonormal bases
for $\mathbb{R}^n$. Then:

$$\begin{aligned}
w_1 &= (w_1 \cdot u_1)u_1 + \cdots + (w_1 \cdot u_n)u_n\\
&\ \ \vdots\\
w_n &= (w_n \cdot u_1)u_1 + \cdots + (w_n \cdot u_n)u_n\\
\Rightarrow\ w_i &= \sum_j u_j\,(u_j \cdot w_i).
\end{aligned}$$

Thus the matrix for the change of basis from $T$ to $R$ is given by
$$P = (P^j_i) = (u_j \cdot w_i).$$


We would like to calculate the product $PP^T$. For that, we first develop a
dirty trick for products of dot products:

$$(u \cdot v)(w \cdot z) = (u^T v)(w^T z) = u^T (v w^T) z.$$

The object $vw^T$ is the square matrix made from the outer product of $v$ and
$w$! Now we are ready to compute the components of the matrix product
$PP^T$:

$$\begin{aligned}
\sum_i (u_j \cdot w_i)(w_i \cdot u_k) &= \sum_i (u_j^T w_i)(w_i^T u_k)\\
&= u_j^T \Big[\sum_i (w_i w_i^T)\Big] u_k\\
&\overset{(*)}{=} u_j^T I_n u_k\\
&= u_j^T u_k = \delta_{jk}.
\end{aligned}$$

The equality $(*)$ is explained below. Assuming $(*)$ holds, we have shown that
$PP^T = I_n$, which implies that

$$P^T = P^{-1}.$$


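The equality in the line $(*)$ says that $\sum_i w_i w_i^T = I_n$. To see this, we
examine $\big(\sum_i w_i w_i^T\big) v$ for an arbitrary vector $v$. We can find constants $c^j$
such that $v = \sum_j c^j w_j$, so that:

$$\begin{aligned}
\Big(\sum_i w_i w_i^T\Big) v &= \Big(\sum_i w_i w_i^T\Big)\Big(\sum_j c^j w_j\Big)\\
&= \sum_j c^j \sum_i w_i\, w_i^T w_j\\
&= \sum_j c^j \sum_i w_i\, \delta_{ij}\\
&= \sum_j c^j w_j \quad\text{since all terms with } i \neq j \text{ vanish}\\
&= v.
\end{aligned}$$

Thus, as a linear transformation, $\sum_i w_i w_i^T$ fixes every vector, and thus
must be the identity $I_n$.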


Definition A matrix $P$ is orthogonal if $P^{-1} = P^T$.

Then to summarize,


Theorem 14.3.1. A change of basis matrix $P$ relating two orthonormal bases
is an orthogonal matrix. I.e.,
$$P^{-1} = P^T.$$


Reading homework: problem 2


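Example 122 Consider $\mathbb{R}^3$ with the orthonormal basis

$$S = \left( u_1 = \begin{pmatrix}\frac{2}{\sqrt6}\\[2pt] \frac{1}{\sqrt6}\\[2pt] \frac{-1}{\sqrt6}\end{pmatrix},\quad u_2 = \begin{pmatrix}0\\[2pt] \frac{1}{\sqrt2}\\[2pt] \frac{1}{\sqrt2}\end{pmatrix},\quad u_3 = \begin{pmatrix}\frac{1}{\sqrt3}\\[2pt] \frac{-1}{\sqrt3}\\[2pt] \frac{1}{\sqrt3}\end{pmatrix} \right).$$

Let $E$ be the standard basis $\{e_1, e_2, e_3\}$. Since we are changing from the standard
basis to a new basis, the columns of the change of basis matrix are exactly the
new basis vectors. Then the change of basis matrix from $E$ to $S$ is given by:

$$P = (P^j_i) = (e_j \cdot u_i) = \begin{pmatrix}e_1 \cdot u_1 & e_1 \cdot u_2 & e_1 \cdot u_3\\ e_2 \cdot u_1 & e_2 \cdot u_2 & e_2 \cdot u_3\\ e_3 \cdot u_1 & e_3 \cdot u_2 & e_3 \cdot u_3\end{pmatrix} = \begin{pmatrix}u_1 & u_2 & u_3\end{pmatrix} = \begin{pmatrix}\frac{2}{\sqrt6} & 0 & \frac{1}{\sqrt3}\\[2pt] \frac{1}{\sqrt6} & \frac{1}{\sqrt2} & \frac{-1}{\sqrt3}\\[2pt] \frac{-1}{\sqrt6} & \frac{1}{\sqrt2} & \frac{1}{\sqrt3}\end{pmatrix}.$$

From our theorem, we observe that:

$$P^{-1} = P^T = \begin{pmatrix}u_1^T\\ u_2^T\\ u_3^T\end{pmatrix} = \begin{pmatrix}\frac{2}{\sqrt6} & \frac{1}{\sqrt6} & \frac{-1}{\sqrt6}\\[2pt] 0 & \frac{1}{\sqrt2} & \frac{1}{\sqrt2}\\[2pt] \frac{1}{\sqrt3} & \frac{-1}{\sqrt3} & \frac{1}{\sqrt3}\end{pmatrix}.$$

We can check that $P^TP = I$ by a lengthy computation, or more simply, notice
that

$$P^TP = \begin{pmatrix}u_1^T\\ u_2^T\\ u_3^T\end{pmatrix}\begin{pmatrix}u_1 & u_2 & u_3\end{pmatrix} = \begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix}.$$

Above we are using orthonormality of the $u_i$ and the fact that matrix multiplication
amounts to taking dot products between rows and columns. It is also very important
to realize that the columns of an orthogonal matrix are made from an orthonormal
set of vectors.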
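The "lengthy computation" mentioned in this example is short for a computer. The sketch below, with helper names of our choosing, forms every dot product $u_i \cdot u_j$ for the basis $u_1 = \tfrac{1}{\sqrt6}(2, 1, -1)$, $u_2 = \tfrac{1}{\sqrt2}(0, 1, 1)$, $u_3 = \tfrac{1}{\sqrt3}(1, -1, 1)$ and compares the result with the identity matrix, up to floating point rounding.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

s6, s2, s3 = math.sqrt(6), math.sqrt(2), math.sqrt(3)
basis = [
    [2 / s6, 1 / s6, -1 / s6],   # u1
    [0.0, 1 / s2, 1 / s2],       # u2
    [1 / s3, -1 / s3, 1 / s3],   # u3
]

# Entry (i, j) of P^T P is the dot product u_i . u_j.
gram = [[dot(a, b) for b in basis] for a in basis]

# Compare with the identity matrix, allowing for rounding error.
is_identity = all(
    abs(gram[i][j] - (1.0 if i == j else 0.0)) < 1e-12
    for i in range(3) for j in range(3)
)
print(is_identity)
```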


Orthonormal Change of Basis and Diagonal Matrices. Suppose $D$ is a diagonal
matrix and we are able to use an orthogonal matrix $P$ to change to a new basis. Then
the matrix $M$ of $D$ in the new basis is:

$$M = PDP^{-1} = PDP^T.$$

Now we calculate the transpose of $M$.

$$\begin{aligned}
M^T &= (PDP^T)^T\\
&= (P^T)^T D^T P^T\\
&= PDP^T\\
&= M.
\end{aligned}$$

The matrix $M = PDP^T$ is symmetric!


14.4 Gram-Schmidt & Orthogonal Complements 


Given a vector $v$ and some other vector $u$ not in $\operatorname{span}\{v\}$, we can construct
a new vector:

$$v^\perp := v - \frac{u \cdot v}{u \cdot u}\, u.$$

This new vector $v^\perp$ is orthogonal to $u$ because

$$u \cdot v^\perp = u \cdot v - \frac{u \cdot v}{u \cdot u}\, u \cdot u = 0.$$

Hence, $\{u, v^\perp\}$ is an orthogonal basis for $\operatorname{span}\{u, v\}$. When $v$ is not
parallel to $u$, $v^\perp \neq 0$, and normalizing these vectors we obtain
$\left\{\frac{u}{|u|}, \frac{v^\perp}{|v^\perp|}\right\}$, an
orthonormal basis for the vector space $\operatorname{span}\{u, v\}$.

Sometimes we write $v = v^\perp + v^\parallel$ where:

$$\begin{aligned}
v^\perp &= v - \frac{u \cdot v}{u \cdot u}\, u\\
v^\parallel &= \frac{u \cdot v}{u \cdot u}\, u.
\end{aligned}$$

This is called an orthogonal decomposition because we have decomposed $v$
into a sum of orthogonal vectors. This decomposition depends on $u$; if we
change the direction of $u$ we change $v^\perp$ and $v^\parallel$.
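The orthogonal decomposition can be sketched in a few lines of Python. The function name `decompose` and the sample vectors are illustrative assumptions, and `Fraction` is used so that the orthogonality check is exact.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def decompose(v, u):
    """Split v into (v_perp, v_par): v_par is parallel to u, v_perp is orthogonal to u."""
    coeff = Fraction(dot(u, v), dot(u, u))        # (u . v) / (u . u)
    v_par = [coeff * a for a in u]
    v_perp = [b - p for b, p in zip(v, v_par)]
    return v_perp, v_par

# Hypothetical sample vectors with integer entries.
u = [1, 1, 0]
v = [3, 1, 1]
v_perp, v_par = decompose(v, u)
print(v_perp, v_par, dot(u, v_perp))
```

Swapping the roles of `u` and `v` gives a different pair of pieces, which illustrates the remark that the decomposition depends on $u$.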

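If $u$, $v$ are linearly independent vectors in $\mathbb{R}^3$, then the set $\{u, v^\perp, u \times v^\perp\}$
would be an orthogonal basis for $\mathbb{R}^3$. This set could then be normalized by
dividing each vector by its length to obtain an orthonormal basis.

However, it often occurs that we are interested in vector spaces with dimension
greater than 3, and must resort to craftier means than cross products
to obtain an orthogonal basis.²


² Actually, given a set $T$ of $(n - 1)$ independent vectors in $n$-space, one can define an
analogue of the cross product that will produce a vector orthogonal to the span of $T$, using
a method exactly analogous to the usual computation for calculating the cross product of
two vectors in $\mathbb{R}^3$. This only gets us the last orthogonal vector, though; the process in
this Section gives a way to get a full orthogonal basis.

Given a third vector $w$, we should first check that $w$ does not lie in the
span of $u$ and $v$, i.e., check that $u$, $v$ and $w$ are linearly independent. If $w$
is not in that span, we then can define:

$$w^\perp := w - \frac{u \cdot w}{u \cdot u}\, u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp.$$

We can check that $u \cdot w^\perp$ and $v^\perp \cdot w^\perp$ are both zero:

$$\begin{aligned}
u \cdot w^\perp &= u \cdot \left( w - \frac{u \cdot w}{u \cdot u}\, u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp \right)\\
&= u \cdot w - \frac{u \cdot w}{u \cdot u}\, u \cdot u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, u \cdot v^\perp\\
&= u \cdot w - u \cdot w - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, u \cdot v^\perp = 0
\end{aligned}$$

since $u$ is orthogonal to $v^\perp$, and

$$\begin{aligned}
v^\perp \cdot w^\perp &= v^\perp \cdot \left( w - \frac{u \cdot w}{u \cdot u}\, u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp \right)\\
&= v^\perp \cdot w - \frac{u \cdot w}{u \cdot u}\, v^\perp \cdot u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp \cdot v^\perp\\
&= v^\perp \cdot w - \frac{u \cdot w}{u \cdot u}\, v^\perp \cdot u - v^\perp \cdot w = 0
\end{aligned}$$

because $u$ is orthogonal to $v^\perp$. Since $w^\perp$ is orthogonal to both $u$ and $v^\perp$, we
have that $\{u, v^\perp, w^\perp\}$ is an orthogonal basis for $\operatorname{span}\{u, v, w\}$.

14.4.1 The Gram-Schmidt Procedure


In fact, given a set $\{v_1, v_2, \ldots\}$ of linearly independent vectors, we can define
an orthogonal basis for $\operatorname{span}\{v_1, v_2, \ldots\}$ consisting of the following vectors:

$$\begin{aligned}
v_1^\perp &:= v_1\\
v_2^\perp &:= v_2 - \frac{v_1^\perp \cdot v_2}{v_1^\perp \cdot v_1^\perp}\, v_1^\perp\\
v_3^\perp &:= v_3 - \frac{v_1^\perp \cdot v_3}{v_1^\perp \cdot v_1^\perp}\, v_1^\perp - \frac{v_2^\perp \cdot v_3}{v_2^\perp \cdot v_2^\perp}\, v_2^\perp\\
&\ \ \vdots\\
v_i^\perp &:= v_i - \frac{v_1^\perp \cdot v_i}{v_1^\perp \cdot v_1^\perp}\, v_1^\perp - \frac{v_2^\perp \cdot v_i}{v_2^\perp \cdot v_2^\perp}\, v_2^\perp - \cdots - \frac{v_{i-1}^\perp \cdot v_i}{v_{i-1}^\perp \cdot v_{i-1}^\perp}\, v_{i-1}^\perp\\
&\ \ \vdots
\end{aligned}$$

Notice that each $v_i^\perp$ here depends on $v_j^\perp$ for every $j < i$. This allows us to
inductively/algorithmically build up a linearly independent, orthogonal set
of vectors $\{v_1^\perp, v_2^\perp, \ldots\}$ such that $\operatorname{span}\{v_1^\perp, v_2^\perp, \ldots\} = \operatorname{span}\{v_1, v_2, \ldots\}$. That
is, we obtain an orthogonal basis for the latter vector space. This algorithm is called
the Gram-Schmidt orthogonalization procedure. Gram worked at a Danish
insurance company over one hundred years ago; Schmidt was a student of
Hilbert (the famous German mathematician).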
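The inductive description above translates almost line by line into code. The following is a minimal Python sketch, not a numerically robust implementation: it assumes the input vectors are linearly independent with integer or rational entries, works in exact arithmetic with `Fraction`, and is applied here to the three vectors used in Example 123. The function name `gram_schmidt` is ours.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors):
    """Return orthogonal vectors with the same span as `vectors`.

    Assumes the input vectors are linearly independent, so that no
    intermediate vector is zero and every division is defined.
    """
    orthogonal = []
    for v in vectors:
        v = [Fraction(x) for x in v]
        w = v[:]
        # Subtract the component of v along each earlier orthogonal vector.
        for p in orthogonal:
            coeff = dot(p, v) / dot(p, p)
            w = [wi - coeff * pi for wi, pi in zip(w, p)]
        orthogonal.append(w)
    return orthogonal

basis = gram_schmidt([[1, 1, 0], [1, 1, 1], [3, 1, 1]])
print(basis)
```

Dividing each output vector by its length would then give an orthonormal basis; that step is left out here so the arithmetic stays rational.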


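Example 123 We'll obtain an orthogonal basis for $\mathbb{R}^3$ by applying Gram-Schmidt to
the linearly independent set

$$\left\{ v_1 = \begin{pmatrix}1\\1\\0\end{pmatrix},\ v_2 = \begin{pmatrix}1\\1\\1\end{pmatrix},\ v_3 = \begin{pmatrix}3\\1\\1\end{pmatrix} \right\}.$$

First, we set $v_1^\perp := v_1$. Then:

$$\begin{aligned}
v_2^\perp &= \begin{pmatrix}1\\1\\1\end{pmatrix} - \frac{2}{2}\begin{pmatrix}1\\1\\0\end{pmatrix} = \begin{pmatrix}0\\0\\1\end{pmatrix}\\
v_3^\perp &= \begin{pmatrix}3\\1\\1\end{pmatrix} - \frac{4}{2}\begin{pmatrix}1\\1\\0\end{pmatrix} - \frac{1}{1}\begin{pmatrix}0\\0\\1\end{pmatrix} = \begin{pmatrix}1\\-1\\0\end{pmatrix}.
\end{aligned}$$

Then the set

$$\left\{ \begin{pmatrix}1\\1\\0\end{pmatrix},\ \begin{pmatrix}0\\0\\1\end{pmatrix},\ \begin{pmatrix}1\\-1\\0\end{pmatrix} \right\}$$

is an orthogonal basis for $\mathbb{R}^3$. To obtain an orthonormal basis, as always we simply
divide each of these vectors by its length, yielding:

$$\left\{ \begin{pmatrix}\frac{1}{\sqrt2}\\[2pt] \frac{1}{\sqrt2}\\[2pt] 0\end{pmatrix},\ \begin{pmatrix}0\\0\\1\end{pmatrix},\ \begin{pmatrix}\frac{1}{\sqrt2}\\[2pt] \frac{-1}{\sqrt2}\\[2pt] 0\end{pmatrix} \right\}.$$

A 4 × 4 Gram-Schmidt Example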


14.5 QR Decomposition 


In chapter 7, section 7.7 we learned how to solve linear systems by decom- 
posing a matrix M into a product of lower and upper triangular matrices 


M=LU. 
The Gram-Schmidt procedure suggests another matrix decomposition, 
M=QR, 


where Q is an orthogonal matrix and R is an upper triangular matrix. So- 
called QR-decompositions are useful for solving linear systems, eigenvalue 
problems and least squares approximations. You can easily get the idea 
behind the QR decomposition by working through a simple example. 


Example 124 Find the QR decomposition of 


$$M = \begin{pmatrix}2 & -1 & 1\\ 1 & 3 & -2\\ 0 & 1 & -2\end{pmatrix}.$$


What we will do is to think of the columns of $M$ as three 3-vectors and use Gram-Schmidt
to build an orthonormal basis from these that will become the columns of
the orthogonal matrix $Q$. We will use the matrix $R$ to record the steps of the
Gram-Schmidt procedure in such a way that the product $QR$ equals $M$.
To begin with we write

$$M = \begin{pmatrix}2 & -\frac{7}{5} & 1\\[2pt] 1 & \frac{14}{5} & -2\\[2pt] 0 & 1 & -2\end{pmatrix}\begin{pmatrix}1 & \frac{1}{5} & 0\\[2pt] 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix}.$$

In the first matrix the first two columns are orthogonal because we simply replaced the
second column of $M$ by the vector that the Gram-Schmidt procedure produces from
the first two columns of $M$, namely

$$\begin{pmatrix}-\frac{7}{5}\\[2pt] \frac{14}{5}\\[2pt] 1\end{pmatrix} = \begin{pmatrix}-1\\3\\1\end{pmatrix} - \frac{1}{5}\begin{pmatrix}2\\1\\0\end{pmatrix}.$$

The matrix on the right is almost the identity matrix, save the $+\frac{1}{5}$ in the second entry
of the first row, whose effect upon multiplying the two matrices precisely undoes what
we did to the second column of the first matrix.

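For the third column of $M$ we use Gram-Schmidt to deduce the third orthogonal
vector

$$\begin{pmatrix}-\frac{1}{6}\\[2pt] \frac{1}{3}\\[2pt] -\frac{7}{6}\end{pmatrix} = \begin{pmatrix}1\\-2\\-2\end{pmatrix} - 0\begin{pmatrix}2\\1\\0\end{pmatrix} - \frac{-9}{\frac{54}{5}}\begin{pmatrix}-\frac{7}{5}\\[2pt] \frac{14}{5}\\[2pt] 1\end{pmatrix},$$

and therefore, using exactly the same procedure write

$$M = \begin{pmatrix}2 & -\frac{7}{5} & -\frac{1}{6}\\[2pt] 1 & \frac{14}{5} & \frac{1}{3}\\[2pt] 0 & 1 & -\frac{7}{6}\end{pmatrix}\begin{pmatrix}1 & \frac{1}{5} & 0\\[2pt] 0 & 1 & -\frac{5}{6}\\[2pt] 0 & 0 & 1\end{pmatrix}.$$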

This is not quite the answer because the first matrix is now made of mutually orthogonal
column vectors, but a bona fide orthogonal matrix is comprised of orthonormal
vectors. To achieve that we divide each column of the first matrix by its length and
multiply the corresponding row of the second matrix by the same amount:

$$M = \begin{pmatrix}\frac{2\sqrt5}{5} & -\frac{7\sqrt{30}}{90} & -\frac{\sqrt6}{18}\\[4pt] \frac{\sqrt5}{5} & \frac{7\sqrt{30}}{45} & \frac{\sqrt6}{9}\\[4pt] 0 & \frac{\sqrt{30}}{18} & -\frac{7\sqrt6}{18}\end{pmatrix}\begin{pmatrix}\sqrt5 & \frac{\sqrt5}{5} & 0\\[4pt] 0 & \frac{3\sqrt{30}}{5} & -\frac{\sqrt{30}}{2}\\[4pt] 0 & 0 & \frac{\sqrt6}{2}\end{pmatrix} = QR.$$

A nice check of this result is to verify that entry $(i, j)$ of the matrix $R$ equals the dot
product of the $i$-th column of $Q$ with the $j$-th column of $M$. (Some people memorize
this fact and use it as a recipe for computing QR decompositions.) A good test of
your own understanding is to work out why this is true!
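The recipe described in this example can be sketched in Python as follows. This is a minimal illustration using classical Gram-Schmidt in floating point, assuming a square matrix with linearly independent columns; it is not how production libraries compute the factorization, and the function name `qr_gram_schmidt` is ours. It builds $R$ from the dot products of the columns of $Q$ with the columns of $M$, as in the check suggested above.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def qr_gram_schmidt(M):
    """Return (Q, R) with M = Q R, for a square M with independent columns."""
    n = len(M)
    cols = [[M[r][c] for r in range(n)] for c in range(n)]
    q_cols = []
    for v in cols:
        w = list(v)
        # Remove the components along the orthonormal vectors found so far.
        for q in q_cols:
            coeff = dot(q, v)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        q_cols.append([wi / norm for wi in w])
    # R[i][j] = (i-th column of Q) . (j-th column of M); zero below the diagonal.
    R = [[dot(q_cols[i], cols[j]) if j >= i else 0.0 for j in range(n)]
         for i in range(n)]
    Q = [[q_cols[c][r] for c in range(n)] for r in range(n)]
    return Q, R

M = [[2, -1, 1], [1, 3, -2], [0, 1, -2]]
Q, R = qr_gram_schmidt(M)
product = [[sum(Q[i][k] * R[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
print(R)
```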
Another QR decomposition example


14.6 Orthogonal Complements 


Let $U$ and $V$ be subspaces of a vector space $W$. In review exercise 2 you are
asked to show that $U \cap V$ is a subspace of $W$, and that $U \cup V$ is not a subspace.
However, $\operatorname{span}(U \cup V)$ is certainly a subspace, since the span of any subset
of a vector space is a subspace. Notice that all elements of $\operatorname{span}(U \cup V)$ take
the form $u + v$ with $u \in U$ and $v \in V$. We call the subspace

$$U + V := \operatorname{span}(U \cup V) = \{u + v \mid u \in U, v \in V\}$$

the sum of $U$ and $V$. Here, we are not adding vectors, but vector spaces to
produce a new vector space!


Definition Given two subspaces $U$ and $V$ of a space $W$ such that
$$U \cap V = \{0_W\},$$
the direct sum of $U$ and $V$ is defined as:
$$U \oplus V = \operatorname{span}(U \cup V) = \{u + v \mid u \in U, v \in V\}.$$
Remark When $U \cap V = \{0_W\}$, $U + V = U \oplus V$.


The direct sum has a very nice property: 


Theorem 14.6.1. If $w \in U \oplus V$ then there is only one way to write $w$ as
the sum of a vector in $U$ and a vector in $V$.


Proof. Suppose that $u + v = u' + v'$, with $u, u' \in U$, and $v, v' \in V$. Then we
could express $0 = (u - u') + (v - v')$. Then $(u - u') = -(v - v')$. Since $U$
and $V$ are subspaces, we have $(u - u') \in U$ and $-(v - v') \in V$. But since
these elements are equal, we also have $(u - u') \in V$. Since $U \cap V = \{0\}$, then
$(u - u') = 0$. Similarly, $(v - v') = 0$. Therefore $u = u'$ and $v = v'$, proving
the theorem.


Reading homework: problem 3

Given a subspace $U$ in $W$, how can we write $W$ as the direct sum of $U$
and something? There is not a unique answer to this question as can be seen
from this picture of subspaces in $W = \mathbb{R}^3$:


However, using the inner product, there is a natural candidate $U^\perp$ for this
second subspace as shown here:


The general definition is as follows:


Definition Given a subspace $U$ of a vector space $W$, define:

$$U^\perp = \{w \in W \mid w \cdot u = 0 \text{ for all } u \in U\}.$$


Remark The set $U^\perp$ (pronounced “U-perp”) is the set of all vectors in $W$ orthogonal
to every vector in $U$. This is also often called the orthogonal complement of $U$.

Possibly by now you are feeling overwhelmed; it may help to watch this quick
overview video:

Overview


Example 125 Consider any plane $P$ through the origin in $\mathbb{R}^3$. Then $P$ is a subspace,
and $P^\perp$ is the line through the origin orthogonal to $P$. For example, if $P$ is the
$xy$-plane, then

$$\mathbb{R}^3 = P \oplus P^\perp = \{(x, y, 0) \mid x, y \in \mathbb{R}\} \oplus \{(0, 0, z) \mid z \in \mathbb{R}\}.$$


Theorem 14.6.2. Let $U$ be a subspace of a finite-dimensional vector space $W$.
Then the set $U^\perp$ is a subspace of $W$, and $W = U \oplus U^\perp$.


Proof. First, to see that $U^\perp$ is a subspace, we only need to check closure,
which requires a simple check: Suppose $v, w \in U^\perp$, then we know

$$v \cdot u = 0 = w \cdot u \quad (\forall u \in U).$$

Hence

$$u \cdot (\alpha v + \beta w) = \alpha\, u \cdot v + \beta\, u \cdot w = 0 \quad (\forall u \in U),$$

and so $\alpha v + \beta w \in U^\perp$.

Next, to form a direct sum between $U$ and $U^\perp$ we need to show that
$U \cap U^\perp = \{0\}$. This holds because if $u \in U$ and $u \in U^\perp$ it follows that

$$u \cdot u = 0 \Leftrightarrow u = 0.$$

Finally, we show that any vector $w \in W$ is in $U \oplus U^\perp$. (This is where
we use the assumption that $W$ is finite-dimensional.) Let $e_1, \ldots, e_n$ be an
orthonormal basis for $U$. Set:

$$\begin{aligned}
u &= (w \cdot e_1)e_1 + \cdots + (w \cdot e_n)e_n \in U,\\
u^\perp &= w - u.
\end{aligned}$$

It is easy to check that $u^\perp \in U^\perp$ (see the Gram-Schmidt procedure). Then
$w = u + u^\perp$, so $w \in U \oplus U^\perp$, and we are done.
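The last step of the proof is constructive, and can be sketched in Python. The subspace below is a hypothetical plane in $\mathbb{R}^3$ with a rational orthonormal basis, chosen so that exact arithmetic with `Fraction` can confirm the orthogonality; the function name `split` is ours.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def split(w, orthonormal_basis_of_U):
    """Return (u, u_perp) with u in U, u_perp orthogonal to U, and w = u + u_perp."""
    u = [Fraction(0)] * len(w)
    for e in orthonormal_basis_of_U:
        c = dot(w, e)                                  # coefficient (w . e_i)
        u = [ui + c * ei for ui, ei in zip(u, e)]
    u_perp = [wi - ui for wi, ui in zip(w, u)]
    return u, u_perp

# Hypothetical example: U is the plane spanned by e1 and e2 below.
e1 = [Fraction(3, 5), Fraction(4, 5), Fraction(0)]
e2 = [Fraction(0), Fraction(0), Fraction(1)]
w = [Fraction(1), Fraction(2), Fraction(3)]

u, u_perp = split(w, [e1, e2])
print(u, u_perp)
```

Since `u_perp` is orthogonal to each basis vector of $U$, it is orthogonal to every vector in $U$, which is exactly the membership in the orthogonal complement used in the proof.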


"a Reading homework: problem 4 
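Example 126 Consider any line $L$ through the origin in $\mathbb{R}^4$. Then $L$ is a subspace,
and $L^\perp$ is a 3-dimensional subspace orthogonal to $L$. For example, let
$L = \operatorname{span}\{(1, 1, 1, 1)\}$ be a line in $\mathbb{R}^4$. Then $L^\perp$ is given by

$$\begin{aligned}
L^\perp &= \{(x, y, z, w) \mid x, y, z, w \in \mathbb{R} \text{ and } (x, y, z, w) \cdot (1, 1, 1, 1) = 0\}\\
&= \{(x, y, z, w) \mid x, y, z, w \in \mathbb{R} \text{ and } x + y + z + w = 0\}.
\end{aligned}$$

It is easy to check that

$$v_1 = \begin{pmatrix}1\\-1\\0\\0\end{pmatrix},\quad v_2 = \begin{pmatrix}1\\0\\-1\\0\end{pmatrix},\quad v_3 = \begin{pmatrix}1\\0\\0\\-1\end{pmatrix},$$

forms a basis for $L^\perp$. We use Gram-Schmidt to find an orthogonal basis for $L^\perp$:
First, we set $v_1^\perp = v_1$. Then

$$\begin{aligned}
v_2^\perp &= \begin{pmatrix}1\\0\\-1\\0\end{pmatrix} - \frac{1}{2}\begin{pmatrix}1\\-1\\0\\0\end{pmatrix} = \begin{pmatrix}\frac12\\[2pt] \frac12\\[2pt] -1\\ 0\end{pmatrix},\\
v_3^\perp &= \begin{pmatrix}1\\0\\0\\-1\end{pmatrix} - \frac{1}{2}\begin{pmatrix}1\\-1\\0\\0\end{pmatrix} - \frac{1/2}{3/2}\begin{pmatrix}\frac12\\[2pt] \frac12\\[2pt] -1\\ 0\end{pmatrix} = \begin{pmatrix}\frac13\\[2pt] \frac13\\[2pt] \frac13\\[2pt] -1\end{pmatrix}.
\end{aligned}$$

So the set

$$\left\{ (1, -1, 0, 0),\ \left(\tfrac12, \tfrac12, -1, 0\right),\ \left(\tfrac13, \tfrac13, \tfrac13, -1\right) \right\}$$

is an orthogonal basis for $L^\perp$. We find an orthonormal basis for $L^\perp$ by dividing each
basis vector by its length:

$$\left\{ \left(\tfrac{1}{\sqrt2}, -\tfrac{1}{\sqrt2}, 0, 0\right),\ \left(\tfrac{1}{\sqrt6}, \tfrac{1}{\sqrt6}, -\tfrac{2}{\sqrt6}, 0\right),\ \left(\tfrac{\sqrt3}{6}, \tfrac{\sqrt3}{6}, \tfrac{\sqrt3}{6}, -\tfrac{\sqrt3}{2}\right) \right\}.$$

Moreover, we have

$$\mathbb{R}^4 = L \oplus L^\perp = \{(c, c, c, c) \mid c \in \mathbb{R}\} \oplus \{(x, y, z, w) \mid x, y, z, w \in \mathbb{R},\ x + y + z + w = 0\}.$$


Notice that for any subspace $U$, the subspace $(U^\perp)^\perp$ is just $U$ again. As
such, $\perp$ is an involution on the set of subspaces of a vector space. (An involution
is any mathematical operation which performed twice does nothing.)

14.7 Review Problems


Webwork:
    Reading Problems      | 1, 2, 3, 4
    Gram-Schmidt          | 5
    Orthogonal eigenbasis | 6, 7
    Orthogonal complement | 8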


1. Let $D = \begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}$.


(a) Write $D$ in terms of the vectors $e_1$ and $e_2$, and their transposes.


(b) Suppose $P = \begin{pmatrix}a & b\\ c & d\end{pmatrix}$ is invertible. Show that $D$ is similar to

$$M = \frac{1}{ad - bc}\begin{pmatrix}\lambda_1 ad - \lambda_2 bc & -(\lambda_1 - \lambda_2)ab\\ (\lambda_1 - \lambda_2)cd & -\lambda_1 bc + \lambda_2 ad\end{pmatrix}.$$


(c) Suppose the vectors $(a, b)$ and $(c, d)$ are orthogonal. What can
you say about $M$ in this case? (Hint: think about what $M^T$ is
equal to.)

2. Suppose $S = \{v_1, \ldots, v_n\}$ is an orthogonal (not orthonormal) basis
for $\mathbb{R}^n$. Then we can write any vector $v$ as $v = \sum_i c^i v_i$ for some
constants $c^i$. Find a formula for the constants $c^i$ in terms of $v$ and the
vectors in $S$.

Hint


3. Let $u, v$ be linearly independent vectors in $\mathbb{R}^3$, and $P = \operatorname{span}\{u, v\}$ be
the plane spanned by $u$ and $v$.


(a) Is the vector $v^\perp := v - \frac{u \cdot v}{u \cdot u}\, u$ in the plane $P$?

(b) What is the (cosine of the) angle between $v^\perp$ and $u$?

(c) How can you find a third vector perpendicular to both $u$ and $v^\perp$?

(d) Construct an orthonormal basis for $\mathbb{R}^3$ from $u$ and $v$.

(e) Test your abstract formulae starting with

$$u = (1, 2, 0) \quad\text{and}\quad v = (0, 1, 1).$$


Hint


4. Find an orthonormal basis for $\mathbb{R}^4$ which includes $(1, 1, 1, 1)$ using the
following procedure:


(a) Pick a vector perpendicular to the vector

$$v_1 = \begin{pmatrix}1\\1\\1\\1\end{pmatrix}$$

from the solution set of the matrix equation

$$v_1^T x = 0.$$

Pick the vector $v_2$ obtained from the standard Gaussian elimination
procedure which is the coefficient of $x_2$.


(b) Pick a vector perpendicular to both $v_1$ and $v_2$ from the solution
set of the matrix equation

$$\begin{pmatrix}v_1^T\\ v_2^T\end{pmatrix} x = 0.$$

Pick the vector $v_3$ obtained from the standard Gaussian elimination
procedure with $x_3$ as the coefficient.


(c) Pick a vector perpendicular to $v_1$, $v_2$, and $v_3$ from the solution set
of the matrix equation

$$\begin{pmatrix}v_1^T\\ v_2^T\\ v_3^T\end{pmatrix} x = 0.$$

Pick the vector $v_4$ obtained from the standard Gaussian elimination
procedure with $x_3$ as the coefficient.

(d) Normalize the four vectors obtained above.


5. Use the inner product

$$f \cdot g := \int_0^1 f(x)g(x)\,dx$$

on the vector space $V = \operatorname{span}\{1, x, x^2, x^3\}$ to perform the Gram-Schmidt
procedure on the set of vectors $\{1, x, x^2, x^3\}$.

6. (a) Show that if $Q$ is an orthogonal $n \times n$ matrix then

$$u \cdot v = (Qu) \cdot (Qv),$$

for any $u, v \in \mathbb{R}^n$. That is, $Q$ preserves the inner product.


(b) Does $Q$ preserve the outer product?


(c) If $\{u_1, \ldots, u_n\}$ is an orthonormal set and $\{\lambda_1, \ldots, \lambda_n\}$ is a set of
numbers then what are the eigenvalues and eigenvectors of the
matrix $M = \sum_{i=1}^n \lambda_i u_i u_i^T$?


(d) How does Q change this matrix? How do the eigenvectors and 
eigenvalues change? 


7. Carefully write out the Gram-Schmidt procedure for the set of vectors

$$\left\{ \begin{pmatrix}1\\1\\1\end{pmatrix},\ \begin{pmatrix}1\\-1\\1\end{pmatrix},\ \begin{pmatrix}1\\1\\-1\end{pmatrix} \right\}.$$

Are you free to rescale the second vector obtained in the procedure to
a vector with integer components?


8. (a) Suppose $u$ and $v$ are linearly independent. Show that $u$ and $v^\perp$
are also linearly independent. Explain why $\{u, v^\perp\}$ is a basis for
$\operatorname{span}\{u, v\}$.

Hint


(b) Repeat the previous problem, but with three independent vectors
$u, v, w$.

9. Find the QR factorization of


10. Given any three vectors $u, v, w$, when do $v^\perp$ or $w^\perp$ of the Gram-Schmidt
procedure vanish?


11. For $U$ a subspace of $W$, use the subspace theorem to check that $U^\perp$ is
a subspace of $W$.


12. Let $S_n$ and $A_n$ define the space of $n \times n$ symmetric and anti-symmetric
matrices respectively. These are subspaces of the vector space $M^n_n$ of
all $n \times n$ matrices. What is $\dim M^n_n$, $\dim S_n$, and $\dim A_n$? Show that
$M^n_n = S_n + A_n$. Is $M^n_n = S_n \oplus A_n$?


13. The vector space $V = \operatorname{span}\{\sin(t), \sin(2t), \sin(3t), \sin(3t)\}$ has an inner
product:

$$f \cdot g := \int_0^{2\pi} f(t)g(t)\,dt.$$

Find the orthogonal complement to $U = \operatorname{span}\{\sin(t) + \sin(2t)\}$ in $V$.
Express $\sin(t) - \sin(2t)$ as the sum of vectors from $U$ and $U^\perp$.
Symmetric matrices have many applications. For example, if we consider the
shortest distance between pairs of important cities, we might get a table like
this:

              | Davis   Seattle   San Francisco
Davis         | 0       2000      80
Seattle       | 2000    0         2010
San Francisco | 80      2010      0


Encoded as a matrix, we obtain:

$$M = \begin{pmatrix}0 & 2000 & 80\\ 2000 & 0 & 2010\\ 80 & 2010 & 0\end{pmatrix} = M^T.$$


Definition A matrix is symmetric if it obeys

$$M = M^T.$$


One very nice property of symmetric matrices is that they always have 
real eigenvalues. Review exercise 1 guides you through the general proof, but 
here’s an example for 2 x 2 matrices: 




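Example 127 For a general symmetric $2 \times 2$ matrix, we have:

$$\begin{aligned}
P_\lambda \begin{pmatrix}a & b\\ b & d\end{pmatrix} &= \det\begin{pmatrix}\lambda - a & -b\\ -b & \lambda - d\end{pmatrix}\\
&= (\lambda - a)(\lambda - d) - b^2\\
&= \lambda^2 - (a + d)\lambda - b^2 + ad\\
\Rightarrow\ \lambda &= \frac{a + d}{2} \pm \sqrt{b^2 + \left(\frac{a - d}{2}\right)^2}.
\end{aligned}$$

Notice that the discriminant $4b^2 + (a - d)^2$ is never negative, so that the eigenvalues
must be real.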
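The closed formula for the eigenvalues of a symmetric $2 \times 2$ matrix can be turned into a short Python sketch. The function name is ours; the point is that the quantity under the square root is a sum of squares, so `math.sqrt` never receives a negative argument for real $a$, $b$, $d$.

```python
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]] via the quadratic formula."""
    mean = (a + d) / 2
    # b^2 + ((a - d)/2)^2 is a sum of squares, hence never negative.
    radius = math.sqrt(b * b + ((a - d) / 2) ** 2)
    return mean + radius, mean - radius

# Sample input: the matrix with a = 2, b = 1, d = 2.
big, small = symmetric_2x2_eigenvalues(2, 1, 2)
print(big, small)
```

The two eigenvalues coincide only when $b = 0$ and $a = d$, that is, when the matrix is a multiple of the identity.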


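Now, suppose a symmetric matrix $M$ has two distinct eigenvalues $\lambda \neq \mu$
and eigenvectors $x$ and $y$:

$$Mx = \lambda x, \qquad My = \mu y.$$

Consider the dot product $x \cdot y = x^T y = y^T x$ and calculate:

$$\begin{aligned}
x^T M y &= x^T \mu y = \mu\, x \cdot y, \quad\text{and}\\
x^T M y &= (x^T M y)^T \quad\text{(by transposing a } 1 \times 1 \text{ matrix)}\\
&= y^T M^T x\\
&= y^T M x\\
&= y^T \lambda x\\
&= \lambda\, x \cdot y.
\end{aligned}$$

Subtracting these two results tells us that:

$$0 = x^T M y - x^T M y = (\mu - \lambda)\, x \cdot y.$$

Since $\mu$ and $\lambda$ were assumed to be distinct eigenvalues, $\lambda - \mu$ is non-zero,
and so $x \cdot y = 0$. We have proved the following theorem.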


Theorem 15.0.1. Eigenvectors of a symmetric matrix with distinct eigen- 
values are orthogonal. 


Reading homework: problem 1

Example 128 The matrix $M = \begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix}$ has eigenvalues determined by

$$\det(M - \lambda I) = (2 - \lambda)^2 - 1 = 0.$$

So the eigenvalues of $M$ are 3 and 1, and the associated eigenvectors turn out to be
$\begin{pmatrix}1\\1\end{pmatrix}$ and $\begin{pmatrix}1\\-1\end{pmatrix}$. It is easily seen that these eigenvectors are orthogonal:

$$\begin{pmatrix}1\\1\end{pmatrix} \cdot \begin{pmatrix}1\\-1\end{pmatrix} = 0.$$


In chapter 14 we saw that the matrix $P$ built from any orthonormal basis
$(v_1, \ldots, v_n)$ for $\mathbb{R}^n$ as its columns,

$$P = \begin{pmatrix}v_1 & \cdots & v_n\end{pmatrix},$$

was an orthogonal matrix:

$$P^{-1} = P^T, \quad\text{or}\quad PP^T = I = P^TP.$$

Moreover, given any (unit) vector $x_1$, one can always find vectors $x_2, \ldots, x_n$
such that $(x_1, \ldots, x_n)$ is an orthonormal basis. (Such a basis can be obtained
using the Gram-Schmidt procedure.)

Now suppose $M$ is a symmetric $n \times n$ matrix and $\lambda_1$ is an eigenvalue with
eigenvector $x_1$ (this is always the case because every matrix has at least one
eigenvalue; see review problem 3). Let the square matrix of column vectors
$P$ be the following:

$$P = \begin{pmatrix}x_1 & x_2 & \cdots & x_n\end{pmatrix},$$

where $x_1$ through $x_n$ are orthonormal, and $x_1$ is an eigenvector for $M$, but
the others are not necessarily eigenvectors for $M$. Then

$$MP = \begin{pmatrix}\lambda_1 x_1 & Mx_2 & \cdots & Mx_n\end{pmatrix}.$$

But $P$ is an orthogonal matrix, so $P^{-1} = P^T$. Then:

$$P^{-1} = P^T = \begin{pmatrix}x_1^T\\ \vdots\\ x_n^T\end{pmatrix}$$

$$\Rightarrow\ P^TMP = \begin{pmatrix}x_1^T\lambda_1 x_1 & * & \cdots & *\\ x_2^T\lambda_1 x_1 & * & \cdots & *\\ \vdots & & & \vdots\\ x_n^T\lambda_1 x_1 & * & \cdots & *\end{pmatrix} = \begin{pmatrix}\lambda_1 & * & \cdots & *\\ 0 & * & \cdots & *\\ \vdots & * & & \vdots\\ 0 & * & \cdots & *\end{pmatrix} = \begin{pmatrix}\lambda_1 & 0 & \cdots & 0\\ 0 & & & \\ \vdots & & \hat{M} & \\ 0 & & & \end{pmatrix}.$$

The last equality follows since $P^TMP$ is symmetric. The asterisks in the
matrix are where “stuff” happens; this extra information is denoted by $\hat{M}$
in the final expression. We know nothing about $\hat{M}$ except that it is an
$(n - 1) \times (n - 1)$ matrix and that it is symmetric. But then, by finding a
(unit) eigenvector for $\hat{M}$, we could repeat this procedure successively. The
end result would be a diagonal matrix with eigenvalues of $M$ on the diagonal.
Again, we have proved a theorem: 


Theorem 15.0.2. Every symmetric matrix is similar to a diagonal matrix
of its eigenvalues. In other words,

$$M = M^T \ \Leftrightarrow\ M = PDP^T$$

where $P$ is an orthogonal matrix and $D$ is a diagonal matrix whose entries
are the eigenvalues of $M$.


Reading homework: problem 2

To diagonalize a real symmetric matrix, begin by building an orthogonal
matrix from an orthonormal basis of eigenvectors:


Example 129 The symmetric matrix

$$M = \begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix}$$

has eigenvalues 3 and 1 with eigenvectors $\begin{pmatrix}1\\1\end{pmatrix}$ and $\begin{pmatrix}1\\-1\end{pmatrix}$ respectively. After
normalizing these eigenvectors, we build the orthogonal matrix:

$$P = \begin{pmatrix}\frac{1}{\sqrt2} & \frac{1}{\sqrt2}\\[2pt] \frac{1}{\sqrt2} & \frac{-1}{\sqrt2}\end{pmatrix}.$$

Notice that $P^TP = I$. Then:

$$MP = \begin{pmatrix}\frac{3}{\sqrt2} & \frac{1}{\sqrt2}\\[2pt] \frac{3}{\sqrt2} & \frac{-1}{\sqrt2}\end{pmatrix} = \begin{pmatrix}\frac{1}{\sqrt2} & \frac{1}{\sqrt2}\\[2pt] \frac{1}{\sqrt2} & \frac{-1}{\sqrt2}\end{pmatrix}\begin{pmatrix}3 & 0\\ 0 & 1\end{pmatrix}.$$

In short, $MP = PD$, so $D = P^TMP$. Then $D$ is the diagonalized form of $M$
and $P$ the associated change-of-basis matrix from the standard basis to the basis of
eigenvectors.
3 × 3 Example
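The diagonalization of the symmetric matrix with rows $(2, 1)$ and $(1, 2)$ by the orthogonal matrix of normalized eigenvectors $\tfrac{1}{\sqrt2}(1, 1)$ and $\tfrac{1}{\sqrt2}(1, -1)$ can be confirmed numerically. The following Python sketch, with helper names of our choosing, forms $P^T M P$ and compares it with the diagonal matrix of eigenvalues up to rounding.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]

s = 1 / math.sqrt(2)
M = [[2, 1], [1, 2]]
P = [[s, s], [s, -s]]     # columns: normalized eigenvectors for 3 and 1

# Because P is orthogonal, its inverse is its transpose.
D = matmul(transpose(P), matmul(M, P))
print(D)
```

No matrix inversion is needed here; that convenience is one practical payoff of choosing an orthonormal basis of eigenvectors.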


15.1 Review Problems 


Webwork:
    Reading Problems                 | 1, 2
    Diagonalizing a symmetric matrix | 3, 4


1. (On Reality of Eigenvalues) 


(a) Suppose $z = x + iy$ where $x, y \in \mathbb{R}$, $i = \sqrt{-1}$, and $\bar{z} = x - iy$.
Compute $z\bar{z}$ and $\bar{z}z$ in terms of $x$ and $y$. What kind of numbers
are $z\bar{z}$ and $\bar{z}z$? (The complex number $\bar{z}$ is called the complex
conjugate of $z$.)


(b) Suppose that $\lambda = x + iy$ is a complex number with $x, y \in \mathbb{R}$, and
that $\lambda = \bar{\lambda}$. Does this determine the value of $x$ or $y$? What kind
of number must $\lambda$ be?

(c) Let $x = \begin{pmatrix}z^1\\ \vdots\\ z^n\end{pmatrix} \in \mathbb{C}^n$. Let $x^\dagger = \begin{pmatrix}\bar{z}^1 & \cdots & \bar{z}^n\end{pmatrix} \in \mathbb{C}^n$ (a $1 \times n$
complex matrix or a row vector). Compute $x^\dagger x$. Using the result
of part 1a, what can you say about the number $x^\dagger x$? (E.g., is it
real, imaginary, positive, negative, etc.)

(d) Suppose $M = M^T$ is an $n \times n$ symmetric matrix with real entries.
Let $\lambda$ be an eigenvalue of $M$ with eigenvector $x$, so $Mx = \lambda x$.
Compute:

$$\frac{x^\dagger M x}{x^\dagger x}.$$

(e) Suppose $\Lambda$ is a $1 \times 1$ matrix. What is $\Lambda^T$?

(f) What is the size of the matrix $x^\dagger M x$?

(g) For any matrix (or vector) $N$, we can compute $\bar{N}$ by applying
complex conjugation to each entry of $N$. Compute $\overline{(x^\dagger)^T}$. Then
compute $\overline{(x^\dagger M x)^T}$. Note that for matrices $\overline{AB + C} = \bar{A}\bar{B} + \bar{C}$.

(h) Show that $\lambda = \bar{\lambda}$. Using the result of a previous part of this
problem, what does this say about $\lambda$?


Hint


2. Let

$$x_1 = \begin{pmatrix}a\\ b\\ c\end{pmatrix},$$

where $a^2 + b^2 + c^2 = 1$. Find vectors $x_2$ and $x_3$ such that $\{x_1, x_2, x_3\}$
is an orthonormal basis for $\mathbb{R}^3$. What can you say about the matrix $P$
whose columns are the vectors $x_1$, $x_2$ and $x_3$ that you found?


3. Let $V \ni v \neq 0$ be a vector space, $\dim V = n$ and $L : V \xrightarrow{\text{linear}} V$.


(a) Explain why the list of vectors $(v, Lv, L^2v, \ldots, L^nv)$ is linearly
dependent.

(b) Explain why there exist scalars $\alpha_i$ not all zero such that
$$\alpha_0 v + \alpha_1 Lv + \alpha_2 L^2 v + \cdots + \alpha_n L^n v = 0.$$


(c) Let $m$ be the largest integer such that $\alpha_m \neq 0$ and

$$p(z) = \alpha_0 + \alpha_1 z + \alpha_2 z^2 + \cdots + \alpha_m z^m.$$

Explain why the polynomial $p(z)$ can be written as

$$p(z) = \alpha_m (z - \lambda_1)(z - \lambda_2) \cdots (z - \lambda_m).$$

[Note that some of the roots $\lambda_i$ could be complex.]


(d) Why does the following equation hold

$$(L - \lambda_1)(L - \lambda_2) \cdots (L - \lambda_m)\, v = 0\,?$$

(e) Explain why one of the numbers $\lambda_i$ ($1 \le i \le m$) must be an
eigenvalue of $L$.


4. (Dimensions of Eigenspaces) 


(a) Let

\[ A = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 2 & -2 \\ 0 & -2 & 2 \end{pmatrix}. \]


Find all eigenvalues of A. 


(b) Find a basis for each eigenspace of A. What is the sum of the 
dimensions of the eigenspaces of A? 


(c) Based on your answer to the previous part, guess a formula for the 
sum of the dimensions of the eigenspaces of a real n x n symmetric 
matrix. Explain why your formula must work for any real n x n 
symmetric matrix. 


5. If $M$ is not square then it cannot be symmetric. However, $MM^T$ and
$M^TM$ are symmetric, and therefore diagonalizable.


(a) Is it the case that all of the eigenvalues of $MM^T$ must also be
eigenvalues of $M^TM$?


(b) Given an eigenvector of $MM^T$, how can you obtain an eigenvector
of $M^TM$?


(c) Let

\[ M = \begin{pmatrix} 1 & 2 \\ 3 & 3 \\ 2 & 1 \end{pmatrix}. \]

Compute an orthonormal basis of eigenvectors for both $MM^T$
and $M^TM$. If any of the eigenvalues for these two matrices agree,
choose an order for them and use it to help order your orthonormal
bases. Finally, change the input and output bases for the
matrix $M$ to these ordered orthonormal bases. Comment on what
you find. (Hint: The result is called the Singular Value Decomposition
Theorem.)
Given a linear transformation

\[ L : V \to W, \]

we want to know if it has an inverse, i.e., is there a linear transformation

\[ M : W \to V \]

such that for any vector $v \in V$, we have

\[ MLv = v, \]

and for any vector $w \in W$, we have

\[ LMw = w. \]


A linear transformation is just a special kind of function from one vector 
space to another. So before we discuss which linear transformations have 
inverses, let us first discuss inverses of arbitrary functions. When we later 
specialize to linear transformations, we’ll also find some nice ways of creating 
subspaces. 

Let $f : S \to T$ be a function from a set $S$ to a set $T$. Recall that $S$ is
called the domain of $f$, $T$ is called the codomain or target of $f$, and the set

\[ \operatorname{ran}(f) = \operatorname{im}(f) = f(S) = \{ f(s) \mid s \in S \} \subset T \]

is called the range or image of $f$. The image of $f$ is the set of elements of $T$
to which the function $f$ maps, i.e., the things in $T$ which you can get to by
starting in $S$ and applying $f$. We can also talk about the pre-image of any
subset $U \subset T$:

\[ f^{-1}(U) = \{ s \in S \mid f(s) \in U \} \subset S. \]

The pre-image of a set $U$ is the set of all elements of $S$ which map to $U$.

Figure 16.1: For the function $f : S \to T$, $S$ is the domain, $T$ is the
target/codomain, $f(S)$ is the image/range and $f^{-1}(U)$ is the preimage of
$U \subset T$.

The function $f$ is one-to-one if different elements in $S$ always map to
different elements in $T$. That is, $f$ is one-to-one if for any elements $x \neq y \in S$,
we have that $f(x) \neq f(y)$.

One-to-one functions are also called injective functions. Notice that injectivity
is a condition on the pre-images of $f$.
The function $f$ is onto if every element of $T$ is mapped to by some element
of $S$. That is, $f$ is onto if for any $t \in T$, there exists some $s \in S$ such that
$f(s) = t$. Onto functions are also called surjective functions. Notice that
surjectivity is a condition on the image of $f$: it requires $T = f(S)$.

If $f$ is both injective and surjective, it is bijective.


Theorem 16.0.1. A function $f : S \to T$ has an inverse function $g : T \to S$
if and only if it is bijective.


Proof. This is an “if and only if” statement so the proof has two parts:


1. (Existence of an inverse $\Rightarrow$ bijective.)


Suppose that $f$ has an inverse function $g$. We need to show $f$ is bijective,
which we break down into injective and surjective:


• The function $f$ is injective: Suppose that we have $s, s' \in S$ such
that $f(s) = f(s')$. We must have that $g(f(s)) = s$ for any $s \in S$, so
in particular $g(f(s)) = s$ and $g(f(s')) = s'$. But since $f(s) = f(s')$,
we have $g(f(s)) = g(f(s'))$ so $s = s'$. Therefore, $f$ is injective.

• The function $f$ is surjective: Let $t$ be any element of $T$. We must
have that $f(g(t)) = t$. Thus, $g(t)$ is an element of $S$ which maps
to $t$. So $f$ is surjective.


2. (Bijectivity $\Rightarrow$ existence of an inverse.) Suppose that $f$ is bijective.
Hence $f$ is surjective, so every element $t \in T$ has at least one pre-image.
Being bijective, $f$ is also injective, so every $t$ has no more than
one pre-image. Therefore, to construct an inverse function $g$, we simply
define $g(t)$ to be the unique pre-image $f^{-1}(t)$ of $t$.
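
For finite sets, the construction in the second part of the proof can be carried out by a short program. The sketch below is illustrative only: it represents a function $f : S \to T$ as a Python dictionary, and the name `inverse_if_bijective` is our own choice.

```python
def inverse_if_bijective(f, S, T):
    """Return the inverse of f: S -> T as a dict, or None if f is not bijective.

    f is a dict sending each element of S to an element of T.
    """
    # Collect the pre-image of every single element t of T.
    preimages = {t: [] for t in T}
    for s in S:
        preimages[f[s]].append(s)
    # Surjective: each t has at least one pre-image.
    # Injective: each t has at most one pre-image.
    if all(len(p) == 1 for p in preimages.values()):
        return {t: p[0] for t, p in preimages.items()}
    return None

S = [1, 2, 3]
T = ["a", "b", "c"]

f = {1: "b", 2: "c", 3: "a"}       # bijective
g = inverse_if_bijective(f, S, T)  # g(t) is the unique pre-image of t

h = {1: "a", 2: "a", 3: "b"}       # neither injective nor surjective
no_inverse = inverse_if_bijective(h, S, T)
```

Here `g` undoes `f` in both orders, while `no_inverse` is `None` because "a" has two pre-images under `h` and "c" has none.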


Now let us specialize to functions f that are linear maps between two 
vector spaces. Everything we said above for arbitrary functions is exactly 
the same for linear functions. However, the structure of vector spaces lets 
us say much more about one-to-one and onto functions whose domains are 
vector spaces than we can say about functions on general sets. For example, 
we know that a linear function always sends $0_V$ to $0_W$, i.e.,

\[ f(0_V) = 0_W. \]

In review exercise 2, you will show that a linear transformation is one-to-one
if and only if $0_V$ is the only vector that is sent to $0_W$. In contrast to arbitrary
functions between sets, by looking at just one (very special) vector, we can
figure out whether $f$ is one-to-one!

Let $L : V \to W$ be a linear transformation. Suppose $L$ is not injective.
Then we can find $v_1 \neq v_2$ such that $Lv_1 = Lv_2$. So $v_1 - v_2 \neq 0$, but

\[ L(v_1 - v_2) = 0. \]


Definition Let $L : V \to W$ be a linear transformation. The set of all vectors
$v$ such that $Lv = 0_W$ is called the kernel of $L$:

\[ \ker L = \{ v \in V \mid Lv = 0_W \} \subset V. \]

Theorem 16.0.2. A linear transformation $L$ is injective if and only if

\[ \ker L = \{0_V\}. \]

Proof. The proof of this theorem is review exercise 2.


Notice that if $L$ has matrix $M$ in some basis, then finding the kernel of $L$
is equivalent to solving the homogeneous system

\[ MX = 0. \]


Example 130 Let $L(x, y) = (x + y, x + 2y, y)$. Is $L$ one-to-one?
To find out, we can solve the linear system:

\[ \left(\begin{array}{cc|c} 1 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 1 & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right). \]

Then all solutions of $MX = 0$ are of the form $x = y = 0$. In other words, $\ker L = \{0\}$,
and so $L$ is injective.
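
The row reduction used to find a kernel can be automated. The following sketch uses exact rational arithmetic and returns one kernel basis vector per free column of the reduced matrix; the function names `rref` and `kernel_basis` are our own and not part of any library.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row echelon form and its pivot columns."""
    A = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(A[0])):
        # Look for a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if p is None:
            continue  # no pivot here, so column c is a free column
        A[r], A[p] = A[p], A[r]
        A[r] = [x / A[r][c] for x in A[r]]
        for i in range(len(A)):
            if i != r and A[i][c] != 0:
                factor = A[i][c]
                A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == len(A):
            break
    return A, pivots

def kernel_basis(rows):
    """Basis of the solution space of MX = 0, one vector per free column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -R[row_index][free]
        basis.append(v)
    return basis

# The matrix of L(x, y) = (x + y, x + 2y, y) from the example above.
M = [[1, 1], [1, 2], [0, 1]]
kernel = kernel_basis(M)  # an empty list means ker L = {0}
```

An empty basis is returned for the matrix of the example, confirming that the kernel is trivial. Applied to the single-row matrix `[[1, 1, 1]]`, the same function returns two basis vectors.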


Reading homework: problem 1


Theorem 16.0.3. Let $L : V \to W$ be linear. Then $\ker L$ is a subspace of $V$.


Proof. Notice that if L(v) = 0 and L(u) = 0, then for any constants c, d, 
L(cu+dv) = 0. Then by the subspace theorem, the kernel of L is a subspace 
of V. 


Example 131 Let $L : \mathbb{R}^3 \to \mathbb{R}$ be the linear transformation defined by $L(x, y, z) =
(x + y + z)$. Then $\ker L$ consists of all vectors $(x, y, z) \in \mathbb{R}^3$ such that $x + y + z = 0$.
Therefore, the set

\[ V = \{ (x, y, z) \in \mathbb{R}^3 \mid x + y + z = 0 \} \]

is a subspace of $\mathbb{R}^3$.


When $L : V \to V$, the above theorem has an interpretation in terms of
the eigenspaces of $L$: Suppose $L$ has a zero eigenvalue. Then the associated
eigenspace consists of all vectors $v$ such that $Lv = 0v = 0$; in other words,
the 0-eigenspace of $L$ is exactly the kernel of $L$.

In the example where $L(x, y) = (x + y, x + 2y, y)$, the map $L$ is clearly
not surjective, since $L$ maps $\mathbb{R}^2$ to a plane through the origin in $\mathbb{R}^3$. But any
plane through the origin is a subspace. In general notice that if $w = L(v)$
and $w' = L(v')$, then for any constants $c, d$, linearity of $L$ ensures that


cw + dw’ = L(cv + dv’). 


Now the subspace theorem strikes again, and we have the following theorem: 
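Theorem 16.0.4. Let $L : V \to W$ be a linear transformation. Then the image $L(V)$ is a subspace
of $W$.


Example 132 Let $L(x, y) = (x + y, x + 2y, y)$. The image of $L$ is a plane through
the origin and thus a subspace of $\mathbb{R}^3$. Indeed the matrix of $L$ in the standard basis is

\[ \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 0 & 1 \end{pmatrix}. \]

The columns of this matrix encode the possible outputs of the function $L$ because

\[ L(x, y) = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = x \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + y \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}. \]

Thus

\[ L(\mathbb{R}^2) = \operatorname{span} \left\{ \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \right\}. \]

Hence, when bases and a linear transformation are given, people often refer to its
image as the column space of the corresponding matrix.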


To find a basis of the image of $L$, we can start with a basis $S = \{v_1, \ldots, v_n\}$
for $V$. Then the most general input for $L$ is of the form $a^1 v_1 + \cdots + a^n v_n$.
In turn, its most general output looks like

\[ L(a^1 v_1 + \cdots + a^n v_n) = a^1 L v_1 + \cdots + a^n L v_n \in \operatorname{span}\{Lv_1, \ldots, Lv_n\}. \]

Thus

\[ L(V) = \operatorname{span} L(S) = \operatorname{span}\{Lv_1, \ldots, Lv_n\}. \]

However, the set $\{Lv_1, \ldots, Lv_n\}$ may not be linearly independent; we must
solve

\[ c^1 L v_1 + \cdots + c^n L v_n = 0, \]

to determine whether it is. By finding relations amongst the elements of
$L(S) = \{Lv_1, \ldots, Lv_n\}$, we can discard vectors until a basis is arrived at.
The size of this basis is the dimension of the image of $L$, which is known as
the rank of $L$.
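
One way to discard dependent vectors, sketched here under the assumption that $L$ is given by a matrix in the standard basis, is to run Gaussian elimination and keep only the columns in which a pivot appears. The function names and the 3 x 3 sample matrix are our own illustrative choices.

```python
from fractions import Fraction

def pivot_columns(rows):
    """Indices of the columns in which Gaussian elimination finds a pivot."""
    A = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(A[0])):
        p = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if p is None:
            continue  # column c is a combination of the earlier pivot columns
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][c] / A[r][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == len(A):
            break
    return pivots

def image_basis(rows):
    """Columns of the original matrix that form a basis for its column space."""
    return [[row[c] for row in rows] for c in pivot_columns(rows)]

# Hypothetical example: the third column is the sum of the first two.
M = [[1, 1, 2],
     [1, 2, 3],
     [0, 1, 1]]

basis = image_basis(M)
rank = len(basis)
```

For this matrix the third column is discarded, leaving the first two columns as a basis of the image, so the rank is 2.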


Definition The rank of a linear transformation $L$ is the dimension of its
image, written

\[ \operatorname{rank} L = \dim L(V) = \dim \operatorname{ran} L. \]

The nullity of a linear transformation is the dimension of the kernel, written

\[ \operatorname{null} L = \dim \ker L. \]


Theorem 16.0.5 (Dimension Formula). Let $L : V \to W$ be a linear transformation,
with $V$ a finite-dimensional vector space¹. Then:

\begin{align*} \dim V &= \dim \ker L + \dim L(V) \\ &= \operatorname{null} L + \operatorname{rank} L. \end{align*}
Proof. Pick a basis for $V$:

\[ \{v_1, \ldots, v_p, u_1, \ldots, u_q\}, \]

where $v_1, \ldots, v_p$ is also a basis for $\ker L$. This can always be done, for example,
by finding a basis for the kernel of $L$ and then extending to a basis for $V$.
Then $p = \operatorname{null} L$ and $p + q = \dim V$. Then we need to show that $q = \operatorname{rank} L$.
To accomplish this, we show that $\{L(u_1), \ldots, L(u_q)\}$ is a basis for $L(V)$.

To see that $\{L(u_1), \ldots, L(u_q)\}$ spans $L(V)$, consider any vector $w$ in $L(V)$.
Then we can find constants $c^i, d^j$ such that:

\begin{align*}
w &= L(c^1 v_1 + \cdots + c^p v_p + d^1 u_1 + \cdots + d^q u_q) \\
  &= c^1 L(v_1) + \cdots + c^p L(v_p) + d^1 L(u_1) + \cdots + d^q L(u_q) \\
  &= d^1 L(u_1) + \cdots + d^q L(u_q) \quad \text{since } L(v_i) = 0, \\
\Rightarrow L(V) &= \operatorname{span}\{L(u_1), \ldots, L(u_q)\}.
\end{align*}

Now we show that $\{L(u_1), \ldots, L(u_q)\}$ is linearly independent. We argue
by contradiction: Suppose there exist constants $d^j$ (not all zero) such that

\begin{align*}
0 &= d^1 L(u_1) + \cdots + d^q L(u_q) \\
  &= L(d^1 u_1 + \cdots + d^q u_q).
\end{align*}

But since the $u_j$ are linearly independent, then $d^1 u_1 + \cdots + d^q u_q \neq 0$, and
so $d^1 u_1 + \cdots + d^q u_q$ is in the kernel of $L$. But then $d^1 u_1 + \cdots + d^q u_q$ must
be in the span of $\{v_1, \ldots, v_p\}$, since this was a basis for the kernel. This
contradicts the assumption that $\{v_1, \ldots, v_p, u_1, \ldots, u_q\}$ was a basis for $V$, so
we are done.


¹ The formula still makes sense for infinite dimensional vector spaces, such as the space
of all polynomials, but the notion of a basis for an infinite dimensional space is more
sticky than in the finite-dimensional case. Furthermore, the dimension formula for infinite
dimensional vector spaces isn’t useful for computing the rank of a linear transformation,
since an equation like $\infty = \infty + x$ cannot be solved for $x$. As such, the proof presented
assumes a finite basis for $V$.


Reading homework: problem 2


16.1 Summary 


We have seen that a linear transformation has an inverse if and only if it is
bijective (i.e., one-to-one and onto). We also know that linear transformations
can be represented by matrices, and we have seen many ways to tell
whether a matrix is invertible. Here is a list of them:


Theorem 16.1.1 (Invertibility). Let $M$ be an $n \times n$ matrix, and let

\[ L : \mathbb{R}^n \to \mathbb{R}^n \]

be the linear transformation defined by $L(v) = Mv$. Then the following
statements are equivalent:


1. If $V$ is any vector in $\mathbb{R}^n$, then the system $MX = V$ has exactly one
solution.

2. The matrix $M$ is row-equivalent to the identity matrix.

3. If $v$ is any vector in $\mathbb{R}^n$, then $L(x) = v$ has exactly one solution.

4. The matrix $M$ is invertible.

5. The homogeneous system $MX = 0$ has no non-zero solutions.

6. The determinant of $M$ is not equal to 0.

7. The transpose matrix $M^T$ is invertible.

8. The matrix $M$ does not have 0 as an eigenvalue.

9. The linear transformation $L$ does not have 0 as an eigenvalue.

10. The characteristic polynomial $\det(\lambda I - M)$ does not have 0 as a root.

11. The columns (or rows) of $M$ span $\mathbb{R}^n$.

12. The columns (or rows) of $M$ are linearly independent.

13. The columns (or rows) of $M$ are a basis for $\mathbb{R}^n$.

14. The linear transformation $L$ is injective.

15. The linear transformation $L$ is surjective.

16. The linear transformation $L$ is bijective.


Note: it is important that M be an n x n matrix! If M is not square, 
then it can’t be invertible, and many of the statements above are no longer 
equivalent to each other. 


Proof. Many of these equivalences were proved earlier in other chapters. 
Some were left as review questions or sample final questions. The rest are 
left as exercises for the reader. 
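
Several of the equivalent conditions can be compared on concrete matrices. The sketch below runs a single Gaussian elimination and reports both the rank (conditions 11 to 13) and the determinant (condition 6); the function name `rank_and_det` and the two sample matrices are our own illustrative choices.

```python
from fractions import Fraction

def rank_and_det(rows):
    """Gaussian elimination on a square matrix; returns (rank, determinant)."""
    A = [[Fraction(x) for x in row] for row in rows]
    n = len(A)
    det, rank = Fraction(1), 0
    for c in range(n):
        p = next((i for i in range(rank, n) if A[i][c] != 0), None)
        if p is None:
            det = Fraction(0)  # a column without a pivot forces det = 0
            continue
        if p != rank:
            A[rank], A[p] = A[p], A[rank]
            det = -det         # a row swap flips the sign of the determinant
        det *= A[rank][c]
        for i in range(rank + 1, n):
            factor = A[i][c] / A[rank][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[rank])]
        rank += 1
    return rank, det

invertible = [[2, 1], [1, 2]]
singular = [[1, 2], [2, 4]]   # second row is twice the first

for M in (invertible, singular):
    rank, det = rank_and_det(M)
    # The columns are a basis exactly when the determinant is non-zero.
    print(rank == len(M), det != 0)
```

For the first matrix both tests print `True`; for the second both print `False`, as the theorem predicts.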




16.2 Review Problems 


Webwork:

Reading Problems                                    1, 2
Elements of kernel                                  3
Basis for column space                              4
Basis for kernel                                    5
Basis for kernel and image                          6
Orthonormal image basis                             7
Orthonormal kernel basis                            8
Orthonormal kernel and image bases                  9
Orthonormal kernel, image and row space bases       10
Rank                                                11


1. Consider an arbitrary matrix $M : \mathbb{R}^m \to \mathbb{R}^n$.

(a) Argue that $Mx = 0$ if and only if $x$ is perpendicular to all columns
of $M^T$.


(b) Argue that $Mx = 0$ if and only if $x$ is perpendicular to all of the linear
combinations of the columns of $M^T$.


(c) Argue that $\ker M$ is perpendicular to $\operatorname{ran} M^T$.
(d) Argue further $\mathbb{R}^m = \ker M \oplus \operatorname{ran} M^T$.
(e) Argue analogously that $\mathbb{R}^n = \ker M^T \oplus \operatorname{ran} M$.


The equations in the last two parts describe how a linear transformation
$M : \mathbb{R}^m \to \mathbb{R}^n$ determines orthogonal decompositions of both its
domain and target. This result sometimes goes by the humble name
The Fundamental Theorem of Linear Algebra.


2. Let $L : V \to W$ be a linear transformation. Show that $\ker L = \{0_V\}$ if
and only if $L$ is one-to-one:


(a) (Trivial kernel $\Rightarrow$ injective.) Suppose that $\ker L = \{0_V\}$. Show
that $L$ is one-to-one. Think about methods of proof: does a proof
by contradiction, a proof by induction, or a direct proof seem most
appropriate?

(b) (Injective $\Rightarrow$ trivial kernel.) Now suppose that $L$ is one-to-one.
Show that $\ker L = \{0_V\}$. That is, show that $0_V$ is in $\ker L$, and
then show that there are no other vectors in $\ker L$.


3. Let $\{v_1, \ldots, v_n\}$ be a basis for $V$. Carefully explain why

\[ L(V) = \operatorname{span}\{Lv_1, \ldots, Lv_n\}. \]


4. Suppose $L : \mathbb{R}^4 \to \mathbb{R}^3$ whose matrix $M$ in the standard basis is row
equivalent to the following matrix:

(a) Explain why the first three columns of the original matrix $M$ form
a basis for $L(\mathbb{R}^4)$.


(b) Find and describe an algorithm (i.e., a general procedure) for
computing a basis for $L(\mathbb{R}^n)$ when $L : \mathbb{R}^n \to \mathbb{R}^m$.

(c) Use your algorithm to find a basis for $L(\mathbb{R}^4)$ when $L : \mathbb{R}^4 \to \mathbb{R}^3$ is
the linear transformation whose matrix $M$ in the standard basis
is

\[ \begin{pmatrix} 2 & 1 & 1 & 4 \\ 0 & 1 & 0 & 5 \\ 4 & 1 & 1 & 6 \end{pmatrix}. \]

5. Claim:


If $\{v_1, \ldots, v_n\}$ is a basis for $\ker L$, where $L : V \to W$, then it
is always possible to extend this set to a basis for $V$.


Choose some simple yet non-trivial linear transformations with non- 
trivial kernels and verify the above claim for those transformations. 


6. Let $P_n(x)$ be the space of polynomials in $x$ of degree less than or equal
to $n$, and consider the derivative operator

\[ \frac{d}{dx} : P_n(x) \to P_n(x). \]

Find the dimension of the kernel and image of this operator. What
happens if the target space is changed to $P_{n-1}(x)$ or $P_{n+1}(x)$?


Now consider $P_2(x, y)$, the space of polynomials of degree two or less
in $x$ and $y$. (Recall how degree is counted; $xy$ is degree two, $y$ is degree
one and $x^2 y$ is degree three, for example.) Let

\[ L := \frac{\partial}{\partial x} + \frac{\partial}{\partial y} : P_2(x, y) \to P_2(x, y). \]

(For example, $L(xy) = \frac{\partial}{\partial x}(xy) + \frac{\partial}{\partial y}(xy) = y + x$.) Find a basis for the
kernel of $L$. Verify the dimension formula in this case.


7. Let's demonstrate some ways the dimension formula can break down if
a vector space is infinite dimensional:

(a) Let $\mathbb{R}[x]$ be the vector space of all polynomials in the variable $x$
with real coefficients. Let $D = \frac{d}{dx}$ be the usual derivative operator.
Show that the range of $D$ is $\mathbb{R}[x]$. What is $\ker D$?
Hint: Use the basis $\{x^n \mid n \in \mathbb{N}\}$.


(b) Let $L : \mathbb{R}[x] \to \mathbb{R}[x]$ be the linear map


What is the kernel and range of M?


(c) Let $V$ be an infinite dimensional vector space and $L : V \to V$ be a
linear operator. Suppose that $\dim \ker L < \infty$; show that $\dim L(V)$
is infinite. Also show that when $\dim L(V) < \infty$, $\dim \ker L$ is
infinite.


8. This question will answer the question, “If I choose a bit vector at
random, what is the probability that it lies in the span of some other
vectors?”


i. Given a collection $S$ of $k$ bit vectors in $B^3$, consider the bit matrix
$M$ whose columns are the vectors in $S$. Show that $S$ is linearly
independent if and only if the kernel of $M$ is trivial, namely the
set $\ker M = \{v \in B^3 \mid Mv = 0\}$ contains only the zero vector.


ii. Give some method for choosing a random bit vector $v$ in $B^3$. Suppose
$S$ is a collection of 2 linearly independent bit vectors in $B^3$.
How can we tell whether $S \cup \{v\}$ is linearly independent? Do you
think it is likely or unlikely that $S \cup \{v\}$ is linearly independent?
Explain your reasoning.


iii. If $P$ is the characteristic polynomial of a $3 \times 3$ bit matrix, what
must the degree of $P$ be? Given that each coefficient must be
either 0 or 1, how many possibilities are there for $P$? How many
of these possible characteristic polynomials have 0 as a root? If $M$
is a $3 \times 3$ bit matrix chosen at random, what is the probability that
it has 0 as an eigenvalue? (Assume that you are choosing a random
matrix $M$ in such a way as to make each characteristic polynomial
equally likely.) What is the probability that the columns of $M$
form a basis for $B^3$? (Hint: what is the relationship between the
kernel of $M$ and its eigenvalues?)

Note: We could ask the same question for real vectors: If I choose a real
vector at random, what is the probability that it lies in the span
of some other vectors? In fact, once we write down a reasonable
way of choosing a random real vector, if I choose a real vector in
$\mathbb{R}^n$ at random, the probability that it lies in the span of $n - 1$
other real vectors is zero!
Least squares and Singular Values


Consider the linear system $L(x) = v$, where $L : U \to W$ is linear, and $v \in W$ is
given. As we have seen, this system may have no solutions, a unique solution,
or a space of solutions. But if $v$ is not in the range of $L$,
there will never be any solutions for $L(x) = v$. However, for many applications
we do not need an exact solution of the system; instead, we try to find
the best approximation possible.


“My work always tried to unite the Truth with the Beautiful, 
but when I had to choose one or the other, I usually chose the 
Beautiful.” 


— Hermann Weyl. 
If the vector space $W$ has a notion of lengths of vectors, we can try to
find $x$ that minimizes $\|L(x) - v\|$.


This method has many applications, such as when trying to fit a (perhaps 
linear) function to a “noisy” set of observations. For example, suppose we 
measured the position of a bicycle on a racetrack once every five seconds. 
Our observations won’t be exact, but so long as the observations are right on 
average, we can figure out a best-possible linear function of position of the 
bicycle in terms of time. 

Suppose $M$ is the matrix for $L$ in some bases for $U$ and $W$, and $v$ and $x$
are given by column vectors $V$ and $X$ in these bases. Then we need to
approximate

\[ MX - V \approx 0. \]

Note that if $\dim U = n$ and $\dim W = m$ then $M$ can be represented by
an $m \times n$ matrix and $x$ and $v$ as vectors in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. Thus,
we can write $W = L(U) \oplus L(U)^\perp$. Then we can uniquely write $v = v^\parallel + v^\perp$,
with $v^\parallel \in L(U)$ and $v^\perp \in L(U)^\perp$.

Thus we should solve $L(u) = v^\parallel$. In components, $v^\perp$ is just $V - MX$, and
is the part we will eventually wish to minimize.

In terms of $M$, recall that $L(U)$ is spanned by the columns of $M$. (In
the standard basis, the columns of $M$ are $Me_1, \ldots, Me_n$.) Then $v^\perp$ must be
perpendicular to the columns of $M$, i.e., $M^T(V - MX) = 0$, or

\[ M^T M X = M^T V. \]

Solutions of $M^TMX = M^TV$ for $X$ are called least squares solutions to
$MX = V$. Notice that any solution $X$ to $MX = V$ is a least squares solution.
However, the converse is often false. In fact, the equation $MX = V$ may have
no solutions at all, but still have least squares solutions to $M^TMX = M^TV$.
Observe that since $M$ is an $m \times n$ matrix, then $M^T$ is an $n \times m$ matrix.
Then $M^TM$ is an $n \times n$ matrix, and is symmetric, since $(M^TM)^T = M^TM$.
Then, for any vector $X$, we can evaluate $X^TM^TMX$ to obtain a number.
This is a very nice number, though! It is just the squared length $\|MX\|^2 =
(MX)^T(MX) = X^TM^TMX$.


Reading homework: problem 1


Now suppose that $\ker L = \{0\}$, so that the only solution to $MX = 0$ is
$X = 0$. (This need not mean that $M$ is invertible because $M$ is an $m \times n$
matrix, so not necessarily square.) However the square matrix $M^TM$ is
invertible. To see this, suppose there was a vector $X$ such that $M^TMX = 0$.
Then it would follow that $X^TM^TMX = \|MX\|^2 = 0$. In other words the
vector $MX$ would have zero length, so could only be the zero vector. But we
are assuming that $\ker L = \{0\}$ so $MX = 0$ implies $X = 0$. Thus the kernel
of $M^TM$ is $\{0\}$ so this matrix is invertible. So, in this case, the least squares
solution (the $X$ that solves $M^TMX = M^TV$) is unique, and is equal to

\[ X = (M^TM)^{-1}M^TV. \]

In a nutshell, this is the least squares method:

• Compute $M^TM$ and $M^TV$.
• Solve $(M^TM)X = M^TV$ by Gaussian elimination.


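Example 133 Captain Conundrum falls off of the leaning tower of Pisa and makes
three (rather shaky) measurements of his velocity at three different times.

t (s)    v (m/s)
1        11
2        19
3        31

Having taken some calculus¹, he believes that his data are best approximated by
a straight line

\[ v = at + b. \]

Then he should find $a$ and $b$ to best fit the data.

\begin{align*} 11 &= a \cdot 1 + b \\ 19 &= a \cdot 2 + b \\ 31 &= a \cdot 3 + b. \end{align*}

As a system of linear equations, this becomes:

\[ \begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} \stackrel{?}{=} \begin{pmatrix} 11 \\ 19 \\ 31 \end{pmatrix}. \]

There is likely no actual straight line solution, so instead solve $M^TMX = M^TV$.

\[ \begin{pmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 11 \\ 19 \\ 31 \end{pmatrix}. \]

This simplifies to the system:

\[ \left(\begin{array}{cc|c} 14 & 6 & 142 \\ 6 & 3 & 61 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & 0 & 10 \\ 0 & 1 & \frac{1}{3} \end{array}\right). \]

Thus, the least-squares fit is the line

\[ v = 10\,t + \frac{1}{3}. \]

Notice that this equation implies that Captain Conundrum accelerates towards Italian
soil at 10 m/s² (which is an excellent approximation to reality) and that he started at
a downward velocity of $\frac{1}{3}$ m/s (perhaps somebody gave him a shove...)!

¹ In fact, he is a Calculus Superhero.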
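
The two steps of the least squares method translate directly into code. The following sketch uses exact fractions and assumes $M^TM$ is invertible (that is, $\ker M = \{0\}$); the helper names `transpose`, `matmul`, `solve` and `least_squares` are our own.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination.

    Assumes A is invertible, which holds for M^T M when ker M = {0}.
    """
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for c in range(n):
        p = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [x - factor * y for x, y in zip(aug[i], aug[c])]
    return [row[-1] for row in aug]

def least_squares(M, V):
    Mt = transpose(M)
    MtM = matmul(Mt, M)                                       # step 1: M^T M
    MtV = [sum(m * v for m, v in zip(row, V)) for row in Mt]  # step 1: M^T V
    return solve(MtM, MtV)                                    # step 2: solve

# Captain Conundrum's data: rows (t, 1) and measured velocities.
M = [[1, 1], [2, 1], [3, 1]]
V = [11, 19, 31]
a, b = least_squares(M, V)
print(a, b)
```

For the data of the example this reproduces the slope 10 and intercept 1/3 found by hand.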


17.1 Singular Value Decomposition 


Suppose

\[ L : V \to W \]

is linear. It is unlikely that $n := \dim V$ and $m := \dim W$ are equal, so the $m \times n$ matrix $M$ of $L$
in bases for $V$ and $W$ will in general not be square. Therefore there is no eigenvalue
problem we can use to uncover a preferred basis. However, if the vector
spaces $V$ and $W$ both have inner products, there does exist an analog of the
eigenvalue problem, namely the singular values of $L$.
Before giving the details of the powerful technique known as the singular
value decomposition, we note that it is an excellent example of what Eugene 
Wigner called the “Unreasonable Effectiveness of Mathematics”: 


There is a story about two friends who were classmates in high school, talking about 
their jobs. One of them became a statistician and was working on population trends. He 
showed a reprint to his former classmate. The reprint started, as usual with the Gaussian 
distribution and the statistician explained to his former classmate the meaning of the 
symbols for the actual population and so on. His classmate was a bit incredulous and was 
not quite sure whether the statistician was pulling his leg. “How can you know that?” 
was his query. “And what is this symbol here?” “Oh,” said the statistician, “this is π.”
“And what is that?” “The ratio of the circumference of the circle to its diameter.” “Well, 
now you are pushing your joke too far,” said the classmate, “surely the population has 
nothing to do with the circumference of the circle.” 


Eugene Wigner, Commun. Pure and Appl. Math. XIII, 1 (1960). 


Whenever we mathematically model a system, any “canonical quantities” 
(those on which we can all agree and do not depend on any choices we make 
for calculating them) will correspond to important features of the system. 
For example, the eigenvalues of the eigenvector equation you found in review
question 1, chapter 12 encode the notes and harmonics that a guitar
string can play! Singular values appear in many linear algebra applications, 
especially those involving very large data sets such as statistics and signal 
processing. 

Let us focus on the $m \times n$ matrix $M$ of a linear transformation $L : V \to W$
written in orthonormal bases for the input and outputs of $L$ (notice, the
existence of these orthonormal bases is predicated on having inner products for
$V$ and $W$). Even though the matrix $M$ is not square, both the matrices $MM^T$
and $M^TM$ are square and symmetric! In terms of linear transformations $M^T$
is the matrix of a linear transformation

\[ L^* : W \to V. \]

Thus $LL^* : W \to W$ and $L^*L : V \to V$ and both have eigenvalue problems.
Moreover, as we learned in chapter 15, both $L^*L$ and $LL^*$ have orthonormal
bases of eigenvectors, and both $MM^T$ and $M^TM$ can be diagonalized.

Next, let us make a simplifying assumption, namely $\ker L = \{0\}$. This
is not necessary, but will make some of our computations simpler. Now
suppose we have found an orthonormal basis $(u_1, \ldots, u_n)$ for $V$ composed of
eigenvectors for $L^*L$:

\[ L^* L u_i = \lambda_i u_i. \]

Hence, multiplying by $L$,

\[ L L^* L u_i = \lambda_i L u_i. \]

I.e., $Lu_i$ is an eigenvector of $LL^*$. The vectors $(Lu_1, \ldots, Lu_n)$ are linearly
independent, because $\ker L = \{0\}$ (this is where we use our simplifying assumption,
but you can try and extend our analysis to the case where it no
longer holds). Let's compute the angles between, and lengths of, these vectors.

For that we express the vectors $u_i$ in the bases used to compute the matrix
$M$ of $L$. Denoting these column vectors by $U_i$ we then compute

\[ (MU_i) \cdot (MU_j) = U_i^T M^T M U_j = \lambda_j U_i^T U_j = \lambda_j\, U_i \cdot U_j = \lambda_j \delta_{ij}. \]

Hence we see that the vectors $(Lu_1, \ldots, Lu_n)$ are orthogonal but not orthonormal.
Moreover, the length of $Lu_i$ is $\sqrt{\lambda_i}$ (note that $\lambda_i = \|MU_i\|^2 > 0$ because $\ker L = \{0\}$). Thus, normalizing lengths, we have that

\[ \left( \frac{Lu_1}{\sqrt{\lambda_1}}, \ldots, \frac{Lu_n}{\sqrt{\lambda_n}} \right) \]

are orthonormal and linearly independent. However, since $\ker L = \{0\}$ we
have $\dim L(V) = \dim V$ and in turn $\dim V \le \dim W$, so $n \le m$. This means
that although the above set of $n$ vectors in $W$ is orthonormal and linearly
independent, it cannot be a basis for $W$ unless $n = m$. However, these vectors are a subset of
the eigenvectors of $LL^*$. Hence an orthonormal basis of eigenvectors of $LL^*$
looks like

\[ O' = \left( \frac{Lu_1}{\sqrt{\lambda_1}}, \ldots, \frac{Lu_n}{\sqrt{\lambda_n}}, v_{n+1}, \ldots, v_m \right) =: (v_1, \ldots, v_m). \]

Now let's compute the matrix of $L$ with respect to the orthonormal basis
$O = (u_1, \ldots, u_n)$ for $V$ and the orthonormal basis $O' = (v_1, \ldots, v_m)$ for $W$.
As usual, our starting point is the computation of $L$ acting on the input basis
vectors:

\[ (Lu_1, \ldots, Lu_n) = (\sqrt{\lambda_1}\, v_1, \ldots, \sqrt{\lambda_n}\, v_n) = (v_1, \ldots, v_m) \begin{pmatrix} \sqrt{\lambda_1} & 0 & \cdots & 0 \\ 0 & \sqrt{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\lambda_n} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}. \]

The result is very close to diagonalization; the numbers $\sqrt{\lambda_i}$ along the leading
diagonal are called the singular values of $L$.


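Example 134 Let the matrix of a linear transformation be

\[ M = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ -1 & 1 \\ -\frac{1}{2} & -\frac{1}{2} \end{pmatrix}. \]

Clearly $\ker M = \{0\}$ while

\[ M^T M = \begin{pmatrix} \frac{3}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{3}{2} \end{pmatrix}, \]

which has eigenvalues and eigenvectors

\[ \lambda = 1,\ u_1 := \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}; \qquad \lambda = 2,\ u_2 := \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}, \]

so our orthonormal input basis is

\[ O = \left( \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}, \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} \right). \]

These are called the right singular vectors of $M$. The vectors

\[ Mu_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ -\frac{1}{\sqrt{2}} \end{pmatrix} \quad \text{and} \quad Mu_2 = \begin{pmatrix} 0 \\ -\sqrt{2} \\ 0 \end{pmatrix} \]

are eigenvectors of

\[ MM^T = \begin{pmatrix} \frac{1}{2} & 0 & -\frac{1}{2} \\ 0 & 2 & 0 \\ -\frac{1}{2} & 0 & \frac{1}{2} \end{pmatrix} \]

with eigenvalues 1 and 2, respectively. The third eigenvector (with eigenvalue 0) of
$MM^T$ is

\[ v_3 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ \frac{1}{\sqrt{2}} \end{pmatrix}. \]

The eigenvectors $Mu_1$ and $Mu_2$ are necessarily orthogonal; dividing them by their
lengths we obtain the left singular vectors and in turn our orthonormal output basis

\[ O' = \left( \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ -\frac{1}{\sqrt{2}} \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}, \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ \frac{1}{\sqrt{2}} \end{pmatrix} \right). \]

The new matrix $M'$ of the linear transformation given by $M$ with respect to the bases
$O$ and $O'$ is

\[ M' = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{2} \\ 0 & 0 \end{pmatrix}, \]

so the singular values are $1, \sqrt{2}$.
Finally note that arranging the column vectors of $O$ and $O'$ into change of basis
matrices

\[ P = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}, \qquad Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & -1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{pmatrix}, \]

we have, as usual,

\[ M' = Q^{-1} M P. \]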
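
The relation between $M$, $P$, $Q$ and $M'$ can be checked in floating point. The sketch below assumes the entries $M = \begin{pmatrix} 1/2 & 1/2 \\ -1 & 1 \\ -1/2 & -1/2 \end{pmatrix}$, which are the values consistent with the singular vectors of the example, and uses $Q^{-1} = Q^T$ because $Q$ is orthogonal; the helper names are our own.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

s = 1 / math.sqrt(2)

# Assumed entries of M, consistent with the singular vectors in the example.
M = [[0.5, 0.5],
     [-1.0, 1.0],
     [-0.5, -0.5]]

P = [[s, s],
     [s, -s]]            # columns: right singular vectors u_1, u_2
Q = [[s, 0.0, s],
     [0.0, -1.0, 0.0],
     [-s, 0.0, s]]       # columns: left singular vectors v_1, v_2, v_3

# Q is orthogonal, so its inverse is its transpose.
M_prime = matmul(transpose(Q), matmul(M, P))

for row in M_prime:
    print([round(x, 12) for x in row])
```

Up to rounding, the printed rows are (1, 0), (0, 1.414...) and (0, 0), so the singular values 1 and $\sqrt{2}$ appear on the leading diagonal.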


Singular vectors and values have a very nice geometric interpretation:
they provide orthonormal bases for the domain and range of $L$ and give
the factors by which $L$ stretches the orthonormal input basis vectors. This
is depicted below for the example we just computed:

[Figure: $L$ maps the orthonormal vectors $u_1, u_2$ in $\mathbb{R}^2$ to multiples of the orthonormal vectors $v_1, v_2$.]
17.2 Review Problems


Webwork: | Reading Problem | 1, 


Congratulations, you have reached the end of the book! 


Now test your skills on the sample final exam. 


1. Let $L : U \to V$ be a linear transformation. Suppose $v \in L(U)$ and you
have found a vector $u_{ps}$ that obeys $L(u_{ps}) = v$.

Explain why you need to compute $\ker L$ to describe the solution set of
the linear system $L(u) = v$.


2. Suppose that $M$ is an $m \times n$ matrix with trivial kernel. Show that for
any vectors $u$ and $v$ in $\mathbb{R}^n$:


• $u^T M^T M v = v^T M^T M u$.

• $v^T M^T M v \geq 0$. In case you are concerned (you don’t need to be)
and for future reference, the notation $v \geq 0$ means each component
$v^i \geq 0$.

• If $v^T M^T M v = 0$, then $v = 0$.


(Hint: Think about the dot product in $\mathbb{R}^n$.)
List of Symbols


“Is an element of”. 


“Is equivalent to”, see equivalence relations. 
Also, “is row equivalent to” for matrices. 


The real numbers. 
The n x n identity matrix. 


The vector space of polynomials of degree at most n with 
coefficients in the field F. 


The vector space of r x k matrices. 
Fields


Definition A field $\mathbb{F}$ is a set with two operations $+$ and $\cdot$, such that for all
$a, b, c \in \mathbb{F}$ the following axioms are satisfied:


A1. Addition is associative: $(a + b) + c = a + (b + c)$.

A2. There exists an additive identity 0.

A3. Addition is commutative: $a + b = b + a$.

A4. There exists an additive inverse $-a$.

M1. Multiplication is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$.

M2. There exists a multiplicative identity 1.

M3. Multiplication is commutative: $a \cdot b = b \cdot a$.

M4. There exists a multiplicative inverse $a^{-1}$ if $a \neq 0$.

D. The distributive law holds: $a \cdot (b + c) = ab + ac$.

Roughly, all of the above mean that you have notions of $+$, $-$, $\times$ and $\div$ just
as for regular real numbers.

Fields are a very beautiful structure; some examples are rational numbers
$\mathbb{Q}$, real numbers $\mathbb{R}$, and complex numbers $\mathbb{C}$. These examples are infinite,
however this does not necessarily have to be the case. The smallest
example of a field has just two elements, $\mathbb{Z}_2 = \{0, 1\}$ or bits. The rules for
addition and multiplication are the usual ones save that

\[ 1 + 1 = 0. \]
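
Because $\mathbb{Z}_2$ has only two elements, the field axioms can be verified by brute force. The sketch below models addition and multiplication of bits as arithmetic modulo 2; the function names are our own.

```python
from itertools import product

BITS = (0, 1)

def add(a, b):
    return (a + b) % 2   # the usual addition, except that 1 + 1 = 0

def mul(a, b):
    return (a * b) % 2

def satisfies_field_axioms():
    for a, b, c in product(BITS, repeat=3):
        if add(add(a, b), c) != add(a, add(b, c)):          # A1
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):          # M1
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):  # D
            return False
    for a, b in product(BITS, repeat=2):
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):  # A3, M3
            return False
    for a in BITS:
        if add(a, 0) != a or mul(a, 1) != a:                # A2, M2
            return False
        if not any(add(a, x) == 0 for x in BITS):           # A4
            return False
        if a != 0 and not any(mul(a, x) == 1 for x in BITS):  # M4
            return False
    return True

print(satisfies_field_axioms())
```

Every axiom holds for the two bits, so the check succeeds. Note that each element is its own additive inverse, since $1 + 1 = 0$.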
Online Resources


Here are some internet places to get linear algebra help:
Strang’s MIT Linear Algebra Course. Videos of lectures and more: 


http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/ 


Beezer’s online Linear Algebra Course 


http://linear.ups.edu/version3.html


The Khan Academy has thousands of free videos on a multitude of 
topics including linear algebra: 


http://www.khanacademy.org/


The Linear Algebra toolkit: 


http://www.math.odu.edu/~bogacki/lat/ 


Carter, Tapia and Papakonstantinou’s online linear algebra resource 


http://ceee.rice.edu/Books/LA/index.html


S.O.S. Mathematics Matrix Algebra primer: 


http://www.sosmath.com/matrix/matrix.html


The Numerical Methods Guy on Youtube. Lots of worked examples: 

http://www.youtube.com/user/numericalmethodsguy


Interactive Mathematics. Lots of useful math lessons on many topics: 


http://www.intmath.com/ 


Stat Trek. A quick matrix tutorial for statistics students: 


http://stattrek.com/matrix-algebra/matrix.aspx 


Wolfram’s Mathworld. An online mathematics encyclopedia: 


http://mathworld.wolfram.com/ 


Paul Dawkins's online math notes:


http://tutorial.math.lamar.edu/ 


Math Doctor Bob: 


http://www.youtube.com/user/MathDoctorBob?feature=watch


Some pictures of how to rotate objects with matrices: 


http://people.cornellcollege.edu/dsherman/visualize-matrix.html 


xkcd. Geek jokes: 


http://xkcd.com/184/ 


See the bridge actually fall down: 


http://anothermathgeek.hubpages.com/hub/What-the-Heck-are-Eigenvalues-and-Eigenvectors
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Sample First Midterm 


Here are some worked problems typical for what you might expect on a first 
midterm examination. 


1. Solve the following linear system. Write the solution set in vector form. 
Check your solution. Write one particular solution and one homogeneous 
solution, if they exist. What does the solution set look like geometrically? 


\[
\begin{array}{rcrcrcr}
x &+& 3y & &   &=& 4 \\
x &-& 2y &+& z &=& 1 \\
2x &+& y &+& z &=& 5
\end{array}
\]

2. Consider the system 

\[
\begin{array}{rcrcrcrcr}
x & & &-& z &+& 2w &=& -1 \\
x &+& y &+& z &-& w &=& 2 \\
 &-& y &-& 2z &+& 3w &=& -3 \\
5x &+& 2y &-& z &+& 4w &=& 1
\end{array}
\]


(a) Write an augmented matrix for this system. 
(b) Use elementary row operations to find its reduced row echelon form. 


(c) Write the solution set for the system in the form 


\[ S = \Big\{ X_0 + \sum_i \mu_i Y_i : \mu_i \in \mathbb{R} \Big\} \]


(d) What are the vectors $X_0$ and $Y_i$ called and which matrix equations do
they solve?


(e) Check separately that $X_0$ and each $Y_i$ solve the matrix systems you
claimed they solved in part (d).


3. Use row operations to invert the matrix

\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 7 & 11 \\ 3 & 7 & 14 & 25 \\ 4 & 11 & 25 & 50 \end{pmatrix} \]


4. Let $M = \begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix}$. Calculate $M^T M^{-1}$. Is $M$ symmetric? What is the
trace of the transpose of $f(M)$, where $f(x) = x^2 - 1$?


5. In this problem $M$ is the matrix

\[ M = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \]

and $X$ is the vector

\[ X = \begin{pmatrix} x \\ y \end{pmatrix}. \]

Calculate all possible dot products between the vectors $X$ and $MX$. Compute
the lengths of $X$ and $MX$. What is the angle between the vectors $MX$
and $X$? Draw a picture of these vectors in the plane. For what values of $\theta$
do you expect equality in the triangle and Cauchy-Schwarz inequalities?


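6. Let $M$ be the matrix

\[ \begin{pmatrix}
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix} \]

Find a formula for $M^k$ for any positive integer power $k$. Try some simple
examples like $k = 2, 3$ if confused.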


7. Determinants: The determinant $\det M$ of a $2 \times 2$ matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is
defined by
\[ \det M = ad - bc. \]


(a) For which values of $\det M$ does $M$ have an inverse?


(b) Write down all $2 \times 2$ bit matrices with determinant 1. (Remember bits
are either 0 or 1 and $1 + 1 = 0$.)


(c) Write down all $2 \times 2$ bit matrices with determinant 0.


(d) Use one of the above examples to show why the following statement is
FALSE.


Square matrices with the same determinant are always row
equivalent.


8. What does it mean for a function to be linear? Check that integration is a
linear function from $V$ to $V$, where $V = \{f : \mathbb{R} \to \mathbb{R} \mid f \text{ is integrable}\}$ is a
vector space over $\mathbb{R}$ with usual addition and scalar multiplication.


9. What are the four main things we need to define for a vector space? Which
of the following is a vector space over $\mathbb{R}$? For those that are not vector
spaces, modify one part of the definition to make it into a vector space.


(a) $V = \{ 2 \times 2$ matrices with entries in $\mathbb{R}\}$, usual matrix addition, and
\[ k \cdot \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} ka & b \\ kc & d \end{pmatrix} \quad \text{for } k \in \mathbb{R}. \]


(b) $V = \{$polynomials with complex coefficients of degree $< 3\}$, with usual
addition and scalar multiplication of polynomials.


(c) $V = \{$vectors in $\mathbb{R}^3$ with at least one entry containing a 1$\}$, with usual
addition and scalar multiplication.


10. Subspaces: If $V$ is a vector space, we say that $U$ is a subspace of $V$ when the
set $U$ is also a vector space, using the vector addition and scalar multiplication
rules of the vector space $V$. (Remember that $U \subset V$ says that “$U$ is a
subset of $V$”, i.e., all elements of $U$ are also elements of $V$. The symbol $\forall$
means “for all” and $\in$ means “is an element of”.)


Explain why additive closure ($u + w \in U$ $\forall\, u, w \in U$) and multiplicative
closure ($r \cdot u \in U$ $\forall\, r \in \mathbb{R}$, $u \in U$) ensure that (i) the zero vector $0 \in U$ and
(ii) every $u \in U$ has an additive inverse.


In fact it suffices to check closure under addition and scalar multiplication
to verify that $U$ is a vector space. Check whether the following choices of $U$
are vector spaces:

(a) $U = \left\{ \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} : x, y \in \mathbb{R} \right\}$

(b) $U = \left\{ \begin{pmatrix} 1 \\ 0 \\ z \end{pmatrix} : z \in \mathbb{R} \right\}$
Solutions 
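1. As an additional exercise, write out the row operations above the $\sim$ signs
below:
\[
\left(\begin{array}{rrr|r} 1 & 3 & 0 & 4 \\ 1 & -2 & 1 & 1 \\ 2 & 1 & 1 & 5 \end{array}\right)
\sim
\left(\begin{array}{rrr|r} 1 & 3 & 0 & 4 \\ 0 & -5 & 1 & -3 \\ 0 & -5 & 1 & -3 \end{array}\right)
\sim
\left(\begin{array}{rrr|r} 1 & 0 & \frac{3}{5} & \frac{11}{5} \\ 0 & 1 & -\frac{1}{5} & \frac{3}{5} \\ 0 & 0 & 0 & 0 \end{array}\right)
\]
Solution set
\[
\left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \frac{11}{5} \\ \frac{3}{5} \\ 0 \end{pmatrix} + \mu \begin{pmatrix} -\frac{3}{5} \\ \frac{1}{5} \\ 1 \end{pmatrix} : \mu \in \mathbb{R} \right\}
\]
Geometrically this represents a line in $\mathbb{R}^3$ through the point $\begin{pmatrix} \frac{11}{5} \\ \frac{3}{5} \\ 0 \end{pmatrix}$ and
running parallel to the vector $\begin{pmatrix} -\frac{3}{5} \\ \frac{1}{5} \\ 1 \end{pmatrix}$.

A particular solution is $\begin{pmatrix} \frac{11}{5} \\ \frac{3}{5} \\ 0 \end{pmatrix}$ and a homogeneous solution is $\begin{pmatrix} -\frac{3}{5} \\ \frac{1}{5} \\ 1 \end{pmatrix}$.

As a double check note that
\[
\begin{pmatrix} 1 & 3 & 0 \\ 1 & -2 & 1 \\ 2 & 1 & 1 \end{pmatrix} \begin{pmatrix} \frac{11}{5} \\ \frac{3}{5} \\ 0 \end{pmatrix} = \begin{pmatrix} 4 \\ 1 \\ 5 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 3 & 0 \\ 1 & -2 & 1 \\ 2 & 1 & 1 \end{pmatrix} \begin{pmatrix} -\frac{3}{5} \\ \frac{1}{5} \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]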
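
The double check above can also be carried out mechanically with exact fractions. This is only an illustrative sketch; the names `matvec` and `point_on_line` are introduced here and are not from the text.

```python
from fractions import Fraction as F

# Coefficient matrix and right-hand side of the system in Problem 1.
M = [[1, 3, 0],
     [1, -2, 1],
     [2, 1, 1]]
V = [4, 1, 5]


def matvec(A, x):
    """Multiply the matrix A by the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


particular = [F(11, 5), F(3, 5), F(0)]
homogeneous = [F(-3, 5), F(1, 5), F(1)]


def point_on_line(mu):
    """The solution obtained for a given value of the free parameter mu."""
    return [p + mu * h for p, h in zip(particular, homogeneous)]
```

Every value of `mu` should give a vector that `matvec(M, ...)` sends to `V`, which is what it means for the solution set to be a line.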


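2. (a) Again, write out the row operations as an additional exercise.
\[
\left(\begin{array}{rrrr|r} 1 & 0 & -1 & 2 & -1 \\ 1 & 1 & 1 & -1 & 2 \\ 0 & -1 & -2 & 3 & -3 \\ 5 & 2 & -1 & 4 & 1 \end{array}\right)
\]
(b)
\[
\sim
\left(\begin{array}{rrrr|r} 1 & 0 & -1 & 2 & -1 \\ 0 & 1 & 2 & -3 & 3 \\ 0 & -1 & -2 & 3 & -3 \\ 0 & 2 & 4 & -6 & 6 \end{array}\right)
\sim
\left(\begin{array}{rrrr|r} 1 & 0 & -1 & 2 & -1 \\ 0 & 1 & 2 & -3 & 3 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right)
\]
(c) Solution set
\[
\left\{ X = \begin{pmatrix} -1 \\ 3 \\ 0 \\ 0 \end{pmatrix} + \mu_1 \begin{pmatrix} 1 \\ -2 \\ 1 \\ 0 \end{pmatrix} + \mu_2 \begin{pmatrix} -2 \\ 3 \\ 0 \\ 1 \end{pmatrix} : \mu_1, \mu_2 \in \mathbb{R} \right\}.
\]
(d) The vector $X_0 = \begin{pmatrix} -1 \\ 3 \\ 0 \\ 0 \end{pmatrix}$ is a particular solution and the vectors $Y_1 = \begin{pmatrix} 1 \\ -2 \\ 1 \\ 0 \end{pmatrix}$
and $Y_2 = \begin{pmatrix} -2 \\ 3 \\ 0 \\ 1 \end{pmatrix}$ are homogeneous solutions. Calling
\[
M = \begin{pmatrix} 1 & 0 & -1 & 2 \\ 1 & 1 & 1 & -1 \\ 0 & -1 & -2 & 3 \\ 5 & 2 & -1 & 4 \end{pmatrix}
\quad\text{and}\quad
V = \begin{pmatrix} -1 \\ 2 \\ -3 \\ 1 \end{pmatrix},
\]
they obey


\[ M X_0 = V, \qquad M Y_1 = 0 = M Y_2. \]


(e) This amounts to performing explicitly the matrix manipulations $M X_0 - V$,
$M Y_1$, $M Y_2$ and checking they all return the zero vector.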


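3. As usual, be sure to write out the row operations above the $\sim$'s so your work
can be easily checked.

\[
\left(\begin{array}{rrrr|rrrr}
1 & 2 & 3 & 4 & 1 & 0 & 0 & 0 \\
2 & 4 & 7 & 11 & 0 & 1 & 0 & 0 \\
3 & 7 & 14 & 25 & 0 & 0 & 1 & 0 \\
4 & 11 & 25 & 50 & 0 & 0 & 0 & 1
\end{array}\right)
\sim
\left(\begin{array}{rrrr|rrrr}
1 & 2 & 3 & 4 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 3 & -2 & 1 & 0 & 0 \\
0 & 1 & 5 & 13 & -3 & 0 & 1 & 0 \\
0 & 3 & 13 & 34 & -4 & 0 & 0 & 1
\end{array}\right)
\]
\[
\sim
\left(\begin{array}{rrrr|rrrr}
1 & 0 & -7 & -22 & 7 & 0 & -2 & 0 \\
0 & 1 & 5 & 13 & -3 & 0 & 1 & 0 \\
0 & 0 & 1 & 3 & -2 & 1 & 0 & 0 \\
0 & 0 & -2 & -5 & 5 & 0 & -3 & 1
\end{array}\right)
\sim
\left(\begin{array}{rrrr|rrrr}
1 & 0 & 0 & -1 & -7 & 7 & -2 & 0 \\
0 & 1 & 0 & -2 & 7 & -5 & 1 & 0 \\
0 & 0 & 1 & 3 & -2 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 2 & -3 & 1
\end{array}\right)
\]
\[
\sim
\left(\begin{array}{rrrr|rrrr}
1 & 0 & 0 & 0 & -6 & 9 & -5 & 1 \\
0 & 1 & 0 & 0 & 9 & -1 & -5 & 2 \\
0 & 0 & 1 & 0 & -5 & -5 & 9 & -3 \\
0 & 0 & 0 & 1 & 1 & 2 & -3 & 1
\end{array}\right).
\]
Check
\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 7 & 11 \\ 3 & 7 & 14 & 25 \\ 4 & 11 & 25 & 50 \end{pmatrix}
\begin{pmatrix} -6 & 9 & -5 & 1 \\ 9 & -1 & -5 & 2 \\ -5 & -5 & 9 & -3 \\ 1 & 2 & -3 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]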
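
The check at the end of this solution is a single matrix multiplication, so it is easy to automate. A minimal sketch follows, with `matmul` written out by hand so that no external library is assumed.

```python
# The matrix from Problem 3 and the inverse found by row reduction.
M = [[1, 2, 3, 4],
     [2, 4, 7, 11],
     [3, 7, 14, 25],
     [4, 11, 25, 50]]

M_inv = [[-6, 9, -5, 1],
         [9, -1, -5, 2],
         [-5, -5, 9, -3],
         [1, 2, -3, 1]]


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# The 4 x 4 identity matrix, for comparison.
identity = [[int(i == j) for j in range(4)] for i in range(4)]
```

Both `matmul(M, M_inv)` and `matmul(M_inv, M)` should reproduce `identity`; checking both orders is a cheap way to catch arithmetic slips in the row reduction.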


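4. We have
\[
M^T M^{-1} = \begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} \frac{1}{5} & \frac{1}{5} \\ \frac{3}{5} & -\frac{2}{5} \end{pmatrix} = \begin{pmatrix} \frac{11}{5} & -\frac{4}{5} \\ -\frac{2}{5} & \frac{3}{5} \end{pmatrix}.
\]


Since $M^T M^{-1} \neq I$, it follows $M^T \neq M$ so $M$ is not symmetric. Finally
\[
\operatorname{tr} f(M)^T = \operatorname{tr} f(M) = \operatorname{tr}(M^2 - I) = \operatorname{tr}\left[ \begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix} \right] - \operatorname{tr} I
\]
\[
= (2 \cdot 2 + 1 \cdot 3) + (3 \cdot 1 + (-1) \cdot (-1)) - 2 = 9.
\]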


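5. First
\[
X \cdot (MX) = X^T M X = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
\]
\[
= \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} x\cos\theta + y\sin\theta \\ -x\sin\theta + y\cos\theta \end{pmatrix} = (x^2 + y^2)\cos\theta.
\]


Now $\|X\| = \sqrt{X \cdot X} = \sqrt{x^2 + y^2}$ and $(MX) \cdot (MX) = X^T M^T M X$. But
\[
M^T M = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
= \begin{pmatrix} \cos^2\theta + \sin^2\theta & 0 \\ 0 & \cos^2\theta + \sin^2\theta \end{pmatrix} = I.
\]


Hence $\|MX\| = \|X\| = \sqrt{x^2 + y^2}$. Thus the cosine of the angle between $X$
and $MX$ is given by
\[
\frac{X \cdot (MX)}{\|X\| \, \|MX\|} = \frac{(x^2 + y^2)\cos\theta}{\sqrt{x^2 + y^2}\,\sqrt{x^2 + y^2}} = \cos\theta.
\]


In other words, the angle is $\theta$ OR $-\theta$. You should draw two pictures, one
where the angle between $X$ and $MX$ is $\theta$, the other where it is $-\theta$.


For Cauchy-Schwarz, $\frac{|X \cdot (MX)|}{\|X\| \, \|MX\|} = |\cos\theta| = 1$ when $\theta = 0, \pi$. For the
triangle equality, $MX = X$ achieves $\|X + MX\| = \|X\| + \|MX\|$, which
requires $\theta = 0$.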


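6. This is a block matrix problem. Notice that the matrix $M$ is really just
$M = \begin{pmatrix} I & I \\ 0 & I \end{pmatrix}$, where $I$ and $0$ are the $3 \times 3$ identity and zero matrices, respectively.
But
\[
M^2 = \begin{pmatrix} I & I \\ 0 & I \end{pmatrix} \begin{pmatrix} I & I \\ 0 & I \end{pmatrix} = \begin{pmatrix} I & 2I \\ 0 & I \end{pmatrix}
\]
and
\[
M^3 = \begin{pmatrix} I & I \\ 0 & I \end{pmatrix} \begin{pmatrix} I & 2I \\ 0 & I \end{pmatrix} = \begin{pmatrix} I & 3I \\ 0 & I \end{pmatrix},
\]
so $M^k = \begin{pmatrix} I & kI \\ 0 & I \end{pmatrix}$, or explicitly
\[
M^k = \begin{pmatrix}
1 & 0 & 0 & k & 0 & 0 \\
0 & 1 & 0 & 0 & k & 0 \\
0 & 0 & 1 & 0 & 0 & k \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\]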
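
The block formula for the powers of $M$ can be tested directly for small $k$. The sketch below builds the claimed answer and compares it with repeated multiplication; the function names `block_matrix` and `matpow` are illustrative choices, not notation from the text.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def block_matrix(k, n=3):
    """The 2n x 2n matrix with blocks (I, kI; 0, I)."""
    rows = []
    for i in range(2 * n):
        row = [0] * (2 * n)
        row[i] = 1              # identity blocks on the diagonal
        if i < n:
            row[i + n] = k      # k times the identity in the top right
        rows.append(row)
    return rows


def matpow(A, k):
    """Compute A^k for a positive integer k by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result


M = block_matrix(1)
```

Comparing `matpow(M, k)` with `block_matrix(k)` for a few values of `k` is evidence for the formula, although only the induction argument above proves it for all `k`.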


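7. (a) Whenever $\det M = ad - bc \neq 0$.


(b) Unit determinant bit matrices:
\[
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},\
\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix},\
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}.
\]

(c) Bit matrices with vanishing determinant:
\[
\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},\
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},
\]
\[
\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix},\
\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix},\
\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.
\]


As a check, count that the total number of $2 \times 2$ bit matrices is
$2^{(\text{number of entries})} = 2^4 = 16$.

(d) To disprove this statement, we just need to find a single counterexample.
All the unit determinant examples above are actually row equivalent
to the identity matrix, so focus on the bit matrices with vanishing
determinant. Then notice (for example), that
\[
\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \not\sim \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
\]


So we have found a pair of matrices that are not row equivalent but
do have the same determinant. It follows that the statement is false.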
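
The counting check in parts (b) and (c) can be done by enumeration, since there are only sixteen bit matrices. This is a minimal sketch; each matrix is stored as a tuple `(a, b, c, d)` standing for rows `(a, b)` and `(c, d)`, a convention chosen here for brevity.

```python
from itertools import product


def det_mod2(a, b, c, d):
    """Determinant ad - bc of a 2 x 2 bit matrix, computed mod 2."""
    return (a * d - b * c) % 2


# All 2 x 2 bit matrices, one entry at a time.
all_matrices = list(product((0, 1), repeat=4))

unit_determinant = [m for m in all_matrices if det_mod2(*m) == 1]
zero_determinant = [m for m in all_matrices if det_mod2(*m) == 0]
```

The two lists partition `all_matrices`, so their lengths must add up to sixteen.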


8. We can call a function $f : V \to W$ linear if the sets $V$ and $W$ are vector
spaces and $f$ obeys


\[ f(\alpha u + \beta v) = \alpha f(u) + \beta f(v), \]


for all $u, v \in V$ and $\alpha, \beta \in \mathbb{R}$.


Now, integration is a linear transformation from the space $V$ of all integrable
functions (don’t be confused between the definition of a linear function
above, and integrable functions $f(x)$ which here are the vectors in $V$)
to the real numbers $\mathbb{R}$, because
\[
\int_{-\infty}^{\infty} \big(\alpha f(x) + \beta g(x)\big)\,dx = \alpha \int_{-\infty}^{\infty} f(x)\,dx + \beta \int_{-\infty}^{\infty} g(x)\,dx.
\]


9. The four main ingredients are (i) a set V of vectors, (ii) a number field K 
(usually K = R), (iii) a rule for adding vectors (vector addition) and (iv) 
a way to multiply vectors by a number to produce a new vector (scalar 
multiplication). There are, of course, ten rules that these four ingredients 
must obey. 


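(a) This is not a vector space. Notice that distributivity of scalar multiplication
requires $2u = (1 + 1)u = u + u$ for any vector $u$ but
\[
2 \cdot \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 2a & b \\ 2c & d \end{pmatrix}
\]
which does not equal
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 2a & 2b \\ 2c & 2d \end{pmatrix}.
\]
This could be repaired by taking
\[
k \cdot \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} ka & kb \\ kc & kd \end{pmatrix}.
\]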


(b) This is a vector space. Although, the question does not ask you to, it is 
a useful exercise to verify that all ten vector space rules are satisfied. 


(c) This is not a vector space for many reasons. An easy one is that 
(1,—1,0) and (—1,1,0) are both in the space, but their sum (0, 0, 0) is 
not (i.e., additive closure fails). The easiest way to repair this would 
be to drop the requirement that there be at least one entry equaling 1. 


10. (i) Thanks to multiplicative closure, if $u \in U$, so is $(-1) \cdot u$. But $(-1) \cdot u + u =
(-1) \cdot u + 1 \cdot u = (-1 + 1) \cdot u = 0 \cdot u = 0$ (at each step in this chain of equalities
we have used the fact that $V$ is a vector space and therefore can use its vector
space rules). By additive closure this sum lies in $U$, so the zero vector of $V$ is in $U$ and
is its zero vector also. (ii) Also, in $V$, for each $u$ there is an element $-u$
such that $u + (-u) = 0$. But $-u = (-1) \cdot u$, which by multiplicative closure must also be in $U$, thus
every $u \in U$ has an additive inverse.


(a) This is a vector space. First we check additive closure: let $\begin{pmatrix} x \\ y \\ 0 \end{pmatrix}$ and
$\begin{pmatrix} z \\ w \\ 0 \end{pmatrix}$ be arbitrary vectors in $U$. But since
$\begin{pmatrix} x \\ y \\ 0 \end{pmatrix} + \begin{pmatrix} z \\ w \\ 0 \end{pmatrix} = \begin{pmatrix} x + z \\ y + w \\ 0 \end{pmatrix}$,
so is their sum (because vectors in $U$ are those whose third component
vanishes). Multiplicative closure is similar: for any $\alpha \in \mathbb{R}$,
$\alpha \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} = \begin{pmatrix} \alpha x \\ \alpha y \\ 0 \end{pmatrix}$, which also has no third component, so is in $U$.

(b) This is not a vector space for various reasons. A simple one is that
$u = \begin{pmatrix} 1 \\ 0 \\ z \end{pmatrix}$ is in $U$ but the vector $u + u = \begin{pmatrix} 2 \\ 0 \\ 2z \end{pmatrix}$ is not in $U$ (it has a 2
in the first component, but vectors in $U$ always have a 1 there).
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Here are some worked problems typical for what you might expect on a second 
midterm examination. 


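1. Find an LU decomposition for the matrix

\[
\begin{pmatrix} 1 & 1 & -1 & 2 \\ 1 & 3 & 2 & 2 \\ -1 & -3 & -4 & 6 \\ 0 & 4 & 7 & -2 \end{pmatrix}
\]

Use your result to solve the system

\[
\begin{array}{rcrcrcrcr}
x &+& y &-& z &+& 2w &=& 7 \\
x &+& 3y &+& 2z &+& 2w &=& 6 \\
-x &-& 3y &-& 4z &+& 6w &=& 12 \\
 & & 4y &+& 7z &-& 2w &=& -7
\end{array}
\]

2. Let
\[
A = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}.
\]
Compute $\det A$. Find all solutions to (i) $AX = 0$ and (ii) $AX = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ for
the vector $X \in \mathbb{R}^3$. Find, but do not solve, the characteristic polynomial of
$A$.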




Sample Second Midterm 


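3. Let $M$ be any $2 \times 2$ matrix. Show

\[ \det M = -\frac{1}{2} \operatorname{tr} M^2 + \frac{1}{2} (\operatorname{tr} M)^2. \]


4. The permanent: Let $M = (M^i_j)$ be an $n \times n$ matrix. An operation producing
a single number from $M$ similar to the determinant is the “permanent”

\[ \operatorname{perm} M = \sum_{\sigma} M^1_{\sigma(1)} M^2_{\sigma(2)} \cdots M^n_{\sigma(n)}. \]

For example

\[ \operatorname{perm} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad + bc. \]

Calculate
\[ \operatorname{perm} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}. \]


What do you think would happen to the permanent of an $n \times n$ matrix $M$
if (include a brief explanation with each answer):


(a) You multiplied $M$ by a number $\lambda$.

(b) You multiplied a row of $M$ by a number $\lambda$.

(c) You took the transpose of $M$.

(d) You swapped two rows of $M$.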


5. Let $X$ be an $n \times 1$ matrix subject to

\[ X^T X = (1), \]

and define
\[ H = I - 2XX^T, \]

(where $I$ is the $n \times n$ identity matrix). Show

\[ H = H^T = H^{-1}. \]


6. Suppose $\lambda$ is an eigenvalue of the matrix $M$ with associated eigenvector $v$.
Is $v$ an eigenvector of $M^k$ (where $k$ is any positive integer)? If so, what
would the associated eigenvalue be?


Now suppose that the matrix $N$ is nilpotent, i.e.
\[ N^k = 0 \]
for some integer $k > 2$. Show that 0 is the only eigenvalue of $N$.

7. Let $M = \begin{pmatrix} 3 & -5 \\ 1 & -3 \end{pmatrix}$. Compute $M^{12}$. (Hint: $2^{12} = 4096$.)

8. The Cayley Hamilton Theorem: Calculate the characteristic polynomial
$P_M(\lambda)$ of the matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Now compute the matrix polynomial
$P_M(M)$. What do you observe? Now suppose the $n \times n$ matrix $A$ is “similar”
to a diagonal matrix $D$, in other words

\[ A = P^{-1} D P \]

for some invertible matrix $P$ and $D$ is a matrix with values $\lambda_1, \lambda_2, \ldots, \lambda_n$
along its diagonal. Show that the two matrix polynomials $P_A(A)$ and $P_A(D)$
are similar (i.e. $P_A(A) = P^{-1} P_A(D) P$). Finally, compute $P_A(D)$. What can
you say about $P_A(A)$?


9. Define what it means for a set $U$ to be a subspace of a vector space $V$.
Now let $U$ and $W$ be non-trivial subspaces of $V$. Are the following also
subspaces? (Remember that $\cup$ means “union” and $\cap$ means “intersection”.)

(a) $U \cup W$

(b) $U \cap W$


In each case draw examples in $\mathbb{R}^3$ that justify your answers. If you answered
“yes” to either part also give a general explanation why this is the case.


10. Define what it means for a set of vectors $\{v_1, v_2, \ldots, v_n\}$ to (i) be linearly
independent, (ii) span a vector space $V$ and (iii) be a basis for a vector
space $V$.


Consider the following vectors in $\mathbb{R}^3$

\[
u = \begin{pmatrix} -1 \\ -4 \\ 3 \end{pmatrix}, \qquad
v = \begin{pmatrix} 4 \\ 5 \\ 0 \end{pmatrix}, \qquad
w = \begin{pmatrix} 10 \\ 7 \\ h + 3 \end{pmatrix}.
\]

For which values of $h$ is $\{u, v, w\}$ a basis for $\mathbb{R}^3$?
Solutions 
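1.
\[
\begin{pmatrix} 1 & 1 & -1 & 2 \\ 1 & 3 & 2 & 2 \\ -1 & -3 & -4 & 6 \\ 0 & 4 & 7 & -2 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 & -1 & 2 \\ 0 & 2 & 3 & 0 \\ 0 & -2 & -5 & 8 \\ 0 & 4 & 7 & -2 \end{pmatrix}
\]
\[
=
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ -1 & -1 & 1 & 0 \\ 0 & 2 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 & -1 & 2 \\ 0 & 2 & 3 & 0 \\ 0 & 0 & -2 & 8 \\ 0 & 0 & 1 & -2 \end{pmatrix}
\]
\[
=
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ -1 & -1 & 1 & 0 \\ 0 & 2 & -\frac{1}{2} & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 & -1 & 2 \\ 0 & 2 & 3 & 0 \\ 0 & 0 & -2 & 8 \\ 0 & 0 & 0 & 2 \end{pmatrix}.
\]


To solve $MX = V$ using $M = LU$ we first solve $LW = V$ whose augmented
matrix reads
\[
\left(\begin{array}{rrrr|r} 1 & 0 & 0 & 0 & 7 \\ 1 & 1 & 0 & 0 & 6 \\ -1 & -1 & 1 & 0 & 12 \\ 0 & 2 & -\frac{1}{2} & 1 & -7 \end{array}\right)
\sim
\left(\begin{array}{rrrr|r} 1 & 0 & 0 & 0 & 7 \\ 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & 18 \\ 0 & 2 & -\frac{1}{2} & 1 & -7 \end{array}\right)
\sim
\left(\begin{array}{rrrr|r} 1 & 0 & 0 & 0 & 7 \\ 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & 18 \\ 0 & 0 & 0 & 1 & 4 \end{array}\right),
\]
from which we can read off $W$. Now we compute $X$ by solving $UX = W$
with the augmented matrix
\[
\left(\begin{array}{rrrr|r} 1 & 1 & -1 & 2 & 7 \\ 0 & 2 & 3 & 0 & -1 \\ 0 & 0 & -2 & 8 & 18 \\ 0 & 0 & 0 & 2 & 4 \end{array}\right)
\sim
\left(\begin{array}{rrrr|r} 1 & 1 & -1 & 0 & 3 \\ 0 & 2 & 3 & 0 & -1 \\ 0 & 0 & -2 & 0 & 2 \\ 0 & 0 & 0 & 1 & 2 \end{array}\right)
\]
\[
\sim
\left(\begin{array}{rrrr|r} 1 & 1 & 0 & 0 & 2 \\ 0 & 2 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 2 \end{array}\right)
\sim
\left(\begin{array}{rrrr|r} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 2 \end{array}\right).
\]


So $x = 1$, $y = 1$, $z = -1$ and $w = 2$.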
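
The two-stage solve (first $LW = V$ by forward substitution, then $UX = W$ by back substitution) can be sketched in a few lines. The helper names `forward_substitution` and `back_substitution` are illustrative, and exact fractions are used so that the entry $-\frac{1}{2}$ in $L$ causes no rounding.

```python
from fractions import Fraction as F

# Factors of the matrix in Problem 1 and the right-hand side V.
L = [[1, 0, 0, 0],
     [1, 1, 0, 0],
     [-1, -1, 1, 0],
     [0, 2, F(-1, 2), 1]]
U = [[1, 1, -1, 2],
     [0, 2, 3, 0],
     [0, 0, -2, 8],
     [0, 0, 0, 2]]
V = [7, 6, 12, -7]


def forward_substitution(lower, rhs):
    """Solve lower * w = rhs, working from the first row down."""
    w = []
    for i, row in enumerate(lower):
        known = sum(row[j] * w[j] for j in range(i))
        w.append(F(rhs[i] - known) / row[i])
    return w


def back_substitution(upper, rhs):
    """Solve upper * x = rhs, working from the last row up."""
    n = len(upper)
    x = [F(0)] * n
    for i in reversed(range(n)):
        known = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = F(rhs[i] - known) / upper[i][i]
    return x


W = forward_substitution(L, V)
X = back_substitution(U, W)
```

Here `W` plays the role of the intermediate vector read off from the first augmented matrix, and `X` holds the values of $x$, $y$, $z$ and $w$ in that order.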


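2. $\det A = 1 \cdot (2 \cdot 6 - 3 \cdot 5) - 1 \cdot (2 \cdot 6 - 3 \cdot 4) + 1 \cdot (2 \cdot 5 - 2 \cdot 4) = -1$.
(i) Since $\det A \neq 0$, the homogeneous system $AX = 0$ only has the solution
$X = 0$. (ii) It is efficient to compute the adjoint
\[
\operatorname{adj} A = \begin{pmatrix} -3 & 0 & 2 \\ -1 & 2 & -1 \\ 1 & -1 & 0 \end{pmatrix}^T = \begin{pmatrix} -3 & -1 & 1 \\ 0 & 2 & -1 \\ 2 & -1 & 0 \end{pmatrix}.
\]
Hence
\[
A^{-1} = \begin{pmatrix} 3 & 1 & -1 \\ 0 & -2 & 1 \\ -2 & 1 & 0 \end{pmatrix}.
\]
Thus
\[
X = \begin{pmatrix} 3 & 1 & -1 \\ 0 & -2 & 1 \\ -2 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix}.
\]
Finally,
\[
P_A(\lambda) = -\det \begin{pmatrix} 1 - \lambda & 1 & 1 \\ 2 & 2 - \lambda & 3 \\ 4 & 5 & 6 - \lambda \end{pmatrix}
\]
\[
= -\Big[ (1 - \lambda)\big[(2 - \lambda)(6 - \lambda) - 15\big] - \big[2 \cdot (6 - \lambda) - 12\big] + \big[10 - 4 \cdot (2 - \lambda)\big] \Big]
\]
\[
= \lambda^3 - 9\lambda^2 - \lambda + 1.
\]

3. Call $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then $\det M = ad - bc$, yet
\[
-\frac{1}{2} \operatorname{tr} M^2 + \frac{1}{2} (\operatorname{tr} M)^2 = -\frac{1}{2} \operatorname{tr} \begin{pmatrix} a^2 + bc & * \\ * & bc + d^2 \end{pmatrix} + \frac{1}{2}(a + d)^2
\]
\[
= -\frac{1}{2}(a^2 + 2bc + d^2) + \frac{1}{2}(a^2 + 2ad + d^2) = ad - bc,
\]
which is what we were asked to show.


4.
\[
\operatorname{perm} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = 1 \cdot (5 \cdot 9 + 6 \cdot 8) + 2 \cdot (4 \cdot 9 + 6 \cdot 7) + 3 \cdot (4 \cdot 8 + 5 \cdot 7) = 450.
\]


(a) Multiplying $M$ by $\lambda$ replaces every matrix element $M^i_{\sigma(i)}$ in the formula
for the permanent by $\lambda M^i_{\sigma(i)}$, and therefore produces an overall factor
$\lambda^n$.

(b) Multiplying the $i$th row by $\lambda$ replaces $M^i_{\sigma(i)}$ in the formula for the
permanent by $\lambda M^i_{\sigma(i)}$. Therefore the permanent is multiplied by an
overall factor $\lambda$.

(c) The permanent of a matrix transposed equals the permanent of the
original matrix, because in the formula for the permanent this amounts
to summing over permutations of rows rather than columns. But we
could then sort the product $M^{\sigma(1)}_1 M^{\sigma(2)}_2 \cdots M^{\sigma(n)}_n$ back into its original
order using the inverse permutation $\sigma^{-1}$. But summing over permutations
is equivalent to summing over inverse permutations, and therefore
the permanent is unchanged.


(d) Swapping two rows also leaves the permanent unchanged. The argument
is almost the same as in the previous part, except that we need
only reshuffle two matrix elements $M^j_{\sigma(i)}$ and $M^i_{\sigma(j)}$ (in the case where
rows $i$ and $j$ were swapped). Then we use the fact that summing over
all permutations $\sigma$ or over all permutations $\tilde{\sigma}$ obtained by swapping a
pair in $\sigma$ are equivalent operations.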
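
The definition of the permanent as a sum over permutations translates almost literally into code, which gives a way to test the four claims above on the example matrix. This is a brute-force sketch (the sum has $n!$ terms), and the function name `permanent` is chosen here for illustration.

```python
from itertools import permutations
from math import prod


def permanent(M):
    """Sum over all permutations s of the products M[0][s(0)] * ... * M[n-1][s(n-1)]."""
    n = len(M)
    return sum(prod(M[i][s[i]] for i in range(n))
               for s in permutations(range(n)))


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# The operations discussed in parts (a)-(d), applied to A.
scaled = [[2 * entry for entry in row] for row in A]     # (a) lambda = 2
row_scaled = [A[0], [5 * entry for entry in A[1]], A[2]]  # (b) lambda = 5
transposed = [list(col) for col in zip(*A)]               # (c)
swapped = [A[1], A[0], A[2]]                              # (d)
```

Comparing `permanent` of each modified matrix with `permanent(A)` shows the factor $\lambda^n$, the factor $\lambda$, and the two invariances respectively.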


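5. Firstly, let's call $(1) = 1$ (the $1 \times 1$ identity matrix). Then we calculate

\[
H^T = (I - 2XX^T)^T = I^T - 2(XX^T)^T = I - 2(X^T)^T X^T = I - 2XX^T = H,
\]
which demonstrates the first equality. Now we compute
\[
H^2 = (I - 2XX^T)(I - 2XX^T) = I - 4XX^T + 4XX^TXX^T
\]
\[
= I - 4XX^T + 4X(X^TX)X^T = I - 4XX^T + 4X \cdot 1 \cdot X^T = I.
\]
So, since $HH = I$, we have $H^{-1} = H$.


6. We know $Mv = \lambda v$. Hence

\[ M^2 v = MMv = M\lambda v = \lambda Mv = \lambda^2 v, \]

and similarly
\[ M^k v = \lambda M^{k-1} v = \ldots = \lambda^k v. \]

So $v$ is an eigenvector of $M^k$ with eigenvalue $\lambda^k$.
Now let us assume $v$ is an eigenvector of the nilpotent matrix $N$ with eigenvalue
$\lambda$. Then from above
\[ N^k v = \lambda^k v \]
but by nilpotence, we also have
\[ N^k v = 0. \]
Hence $\lambda^k v = 0$ and $v$ (being an eigenvector) cannot vanish. Thus $\lambda^k = 0$
and in turn $\lambda = 0$.

7. Let us think about the eigenvalue problem $Mv = \lambda v$. This has solutions
when
\[
0 = \det \begin{pmatrix} 3 - \lambda & -5 \\ 1 & -3 - \lambda \end{pmatrix} = \lambda^2 - 4 \ \Rightarrow\ \lambda = \pm 2.
\]

The associated eigenvectors solve the homogeneous systems (in augmented
matrix form)
\[
\left(\begin{array}{rr|r} 1 & -5 & 0 \\ 1 & -5 & 0 \end{array}\right) \sim \left(\begin{array}{rr|r} 1 & -5 & 0 \\ 0 & 0 & 0 \end{array}\right)
\quad\text{and}\quad
\left(\begin{array}{rr|r} 5 & -5 & 0 \\ 1 & -1 & 0 \end{array}\right) \sim \left(\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right),
\]
respectively, so are $v_2 = \begin{pmatrix} 5 \\ 1 \end{pmatrix}$ and $v_{-2} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Hence $M^{12} v_2 = 2^{12} v_2$ and
$M^{12} v_{-2} = (-2)^{12} v_{-2}$. Now, $\begin{pmatrix} x \\ y \end{pmatrix} = \frac{x - y}{4} \begin{pmatrix} 5 \\ 1 \end{pmatrix} - \frac{x - 5y}{4} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ (this was obtained
by solving the linear system $a v_2 + b v_{-2} = \begin{pmatrix} x \\ y \end{pmatrix}$ for $a$ and $b$). Thus
\[
M^{12} \begin{pmatrix} x \\ y \end{pmatrix} = \frac{x - y}{4} M^{12} v_2 - \frac{x - 5y}{4} M^{12} v_{-2}
= 2^{12} \left( \frac{x - y}{4} v_2 - \frac{x - 5y}{4} v_{-2} \right) = 2^{12} \begin{pmatrix} x \\ y \end{pmatrix}.
\]
Thus
\[
M^{12} = \begin{pmatrix} 4096 & 0 \\ 0 & 4096 \end{pmatrix}.
\]

If you understand the above explanation, then you have a good understanding
of diagonalization. A quicker route is simply to observe that $M^2 = \begin{pmatrix} 4 & 0 \\ 0 & 4 \end{pmatrix}$.


8.
\[
P_M(\lambda) = (-1)^2 \det \begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix} = (\lambda - a)(\lambda - d) - bc.
\]
Thus
\[
P_M(M) = (M - aI)(M - dI) - bcI
\]
\[
= \left[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} - \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \right] \left[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} - \begin{pmatrix} d & 0 \\ 0 & d \end{pmatrix} \right] - \begin{pmatrix} bc & 0 \\ 0 & bc \end{pmatrix}
\]
\[
= \begin{pmatrix} 0 & b \\ c & d - a \end{pmatrix} \begin{pmatrix} a - d & b \\ c & 0 \end{pmatrix} - \begin{pmatrix} bc & 0 \\ 0 & bc \end{pmatrix} = 0.
\]

Observe that any $2 \times 2$ matrix is a zero of its own characteristic polynomial
(in fact this holds for square matrices of any size).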
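
Both results above, the value of $M^{12}$ and the $2 \times 2$ Cayley-Hamilton identity, are easy to confirm numerically. The sketch below uses the equivalent form $P_M(\lambda) = \lambda^2 - (\operatorname{tr} M)\lambda + \det M$ of the characteristic polynomial; the function names are illustrative.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matpow(A, k):
    """Compute A^k for a positive integer k by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result


def cayley_hamilton_residual(A):
    """Return A^2 - (tr A) A + (det A) I, which should be the zero matrix."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    A2 = matmul(A, A)
    return [[A2[i][j] - trace * A[i][j] + det * int(i == j)
             for j in range(2)] for i in range(2)]


M = [[3, -5],
     [1, -3]]
```

For this particular `M` the trace vanishes, which is why `matmul(M, M)` is already a multiple of the identity and the twelfth power is so simple.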
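Now if $A = P^{-1}DP$ then $A^2 = P^{-1}DPP^{-1}DP = P^{-1}D^2P$. Similarly
$A^k = P^{-1}D^kP$. So for any matrix polynomial we have

\[
A^n + c_1 A^{n-1} + \cdots + c_{n-1} A + c_n I
\]
\[
= P^{-1}D^nP + c_1 P^{-1}D^{n-1}P + \cdots + c_{n-1} P^{-1}DP + c_n P^{-1}P
\]
\[
= P^{-1}\big(D^n + c_1 D^{n-1} + \cdots + c_{n-1} D + c_n I\big)P.
\]

Thus we may conclude $P_A(A) = P^{-1} P_A(D) P$.


Now suppose $D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & & \vdots \\ \vdots & & \ddots & \\ 0 & \cdots & & \lambda_n \end{pmatrix}$. Then

\[
P_A(\lambda) = \det(\lambda I - A) = \det(\lambda P^{-1}IP - P^{-1}DP) = \det P^{-1} \cdot \det(\lambda I - D) \cdot \det P
\]
\[
= \det(\lambda I - D) = \det \begin{pmatrix} \lambda - \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda - \lambda_2 & & \vdots \\ \vdots & & \ddots & \\ 0 & \cdots & & \lambda - \lambda_n \end{pmatrix}
= (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n).
\]
Thus we see that $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A$. Finally we compute

\[
P_A(D) = (D - \lambda_1 I)(D - \lambda_2 I) \cdots (D - \lambda_n I)
\]
\[
= \begin{pmatrix} 0 & & & \\ & \lambda_2 - \lambda_1 & & \\ & & \ddots & \\ & & & \lambda_n - \lambda_1 \end{pmatrix}
\begin{pmatrix} \lambda_1 - \lambda_2 & & & \\ & 0 & & \\ & & \ddots & \\ & & & \lambda_n - \lambda_2 \end{pmatrix}
\cdots
\begin{pmatrix} \lambda_1 - \lambda_n & & & \\ & \lambda_2 - \lambda_n & & \\ & & \ddots & \\ & & & 0 \end{pmatrix} = 0,
\]
because every diagonal position is multiplied by a zero from one of the factors.

We conclude that $P_A(A) = P^{-1} P_A(D) P = 0$.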


9. A subset of a vector space is called a subspace if it itself is a vector space,
using the rules for vector addition and scalar multiplication inherited from
the original vector space.


(a) So long as $U \neq U \cup W \neq W$ the answer is no. Take, for example, $U$
to be the $x$-axis in $\mathbb{R}^2$ and $W$ to be the $y$-axis. Then $(1, 0) \in U$ and
$(0, 1) \in W$, but $(1, 0) + (0, 1) = (1, 1) \notin U \cup W$. So $U \cup W$ is not
additively closed and is not a vector space (and thus not a subspace).
It is easy to draw the example described.

(b) Here the answer is always yes. The proof is not difficult. Take vectors
$u$ and $w$ such that $u \in U \cap W \ni w$. This means that both $u$ and $w$
are in both $U$ and $W$. But, since $U$ is a vector space, $\alpha u + \beta w$ is also
in $U$. Similarly, $\alpha u + \beta w \in W$. Hence $\alpha u + \beta w \in U \cap W$. So closure
holds in $U \cap W$ and this set is a subspace by the subspace theorem.
Here, a good picture to draw is two planes through the origin in $\mathbb{R}^3$
intersecting at a line (also through the origin).


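10. (i) We say that the vectors $\{v_1, v_2, \ldots, v_n\}$ are linearly independent if there
exist no constants $c^1, c^2, \ldots, c^n$ (not all vanishing) such that $c^1 v_1 + c^2 v_2 +
\cdots + c^n v_n = 0$. Alternatively, we can require that there is no non-trivial
solution for scalars $c^1, c^2, \ldots, c^n$ to the linear system $c^1 v_1 + c^2 v_2 + \cdots +
c^n v_n = 0$. (ii) We say that these vectors span a vector space $V$ if the set
$\operatorname{span}\{v_1, v_2, \ldots, v_n\} = \{c^1 v_1 + c^2 v_2 + \cdots + c^n v_n : c^1, c^2, \ldots, c^n \in \mathbb{R}\} = V$. (iii)
We call $\{v_1, v_2, \ldots, v_n\}$ a basis for $V$ if $\{v_1, v_2, \ldots, v_n\}$ are linearly independent
and $\operatorname{span}\{v_1, v_2, \ldots, v_n\} = V$.


For $u, v, w$ to be a basis for $\mathbb{R}^3$, we firstly need (the spanning requirement)
that any vector $\begin{pmatrix} x \\ y \\ z \end{pmatrix}$ can be written as a linear combination of $u$, $v$ and $w$
\[
c^1 \begin{pmatrix} -1 \\ -4 \\ 3 \end{pmatrix} + c^2 \begin{pmatrix} 4 \\ 5 \\ 0 \end{pmatrix} + c^3 \begin{pmatrix} 10 \\ 7 \\ h + 3 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}.
\]


The linear independence requirement implies that when $x = y = z = 0$, the
only solution to the above system is $c^1 = c^2 = c^3 = 0$. But the above system
in matrix language reads
\[
\begin{pmatrix} -1 & 4 & 10 \\ -4 & 5 & 7 \\ 3 & 0 & h + 3 \end{pmatrix} \begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}.
\]


Both requirements mean that the matrix on the left hand side must be
invertible, so we examine its determinant
\[
\det \begin{pmatrix} -1 & 4 & 10 \\ -4 & 5 & 7 \\ 3 & 0 & h + 3 \end{pmatrix} = -4 \cdot (-4 \cdot (h + 3) - 7 \cdot 3) + 5 \cdot (-1 \cdot (h + 3) - 10 \cdot 3)
\]
\[
= 11(h - 3).
\]


Hence we obtain a basis whenever $h \neq 3$.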
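
The determinant formula can be spot-checked for several integer values of $h$. The following sketch uses a hand-written cofactor expansion; the names `det3` and `basis_matrix` are chosen here for illustration.

```python
def det3(A):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def basis_matrix(h):
    """The matrix whose columns are u, v and w for a given value of h."""
    return [[-1, 4, 10],
            [-4, 5, 7],
            [3, 0, h + 3]]
```

For each integer `h`, `det3(basis_matrix(h))` should agree with `11 * (h - 3)`, and in particular it vanishes only at `h = 3`.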
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Sample Final Exam 


Here are some worked problems typical for what you might expect on a final 
examination. 


1. Define the following terms:


(a) An orthogonal matrix.

(b) A basis for a vector space.

(c) The span of a set of vectors.

(d) The dimension of a vector space.

(e) An eigenvector.

(f) A subspace of a vector space.

(g) The kernel of a linear transformation.

(h) The nullity of a linear transformation.

(i) The image of a linear transformation.

(j) The rank of a linear transformation.

(k) The characteristic polynomial of a square matrix.

(l) An equivalence relation.

(m) A homogeneous solution to a linear system of equations.

(n) A particular solution to a linear system of equations.

(o) The general solution to a linear system of equations.

(p) The direct sum of a pair of subspaces of a vector space.

(q) The orthogonal complement to a subspace of a vector space.


2. Kirchhoff’s laws: Electrical circuits are easy to analyze using systems of
equations. The change in voltage (measured in Volts) around any loop due to
batteries and resistors (given by the product of the current measured
in Amps and resistance measured in Ohms) equals zero. Also, the sum
of currents entering any junction vanishes. Consider the circuit


[Circuit diagram. Labels recoverable from the figure: resistors of 1 Ohm, 2 Ohms, 3 Ohms and 3 Ohms; currents of I Amps, 13 Amps and J Amps; batteries of 60 Volts, 80 Volts and V Volts.]


Find all possible equations for the unknowns $I$, $J$ and $V$ and then solve for
$I$, $J$ and $V$. Give your answers with correct units.


3. Suppose $M$ is the matrix of a linear transformation

\[ L : U \to V \]
and the vector spaces $U$ and $V$ have dimensions
\[ \dim U = n, \qquad \dim V = m, \]
and
\[ m \neq n. \]
Also assume
\[ \ker L = \{0_U\}. \]

(a) How many rows does $M$ have?

(b) How many columns does $M$ have?

(c) Are the columns of $M$ linearly independent?

(d) What size matrix is $M^T M$?

(e) What size matrix is $M M^T$?

(f) Is $M^T M$ invertible?

(g) Is $M^T M$ symmetric?

(h) Is $M^T M$ diagonalizable?

(i) Does $M^T M$ have a zero eigenvalue?

(j) Suppose $U = V$ and $\ker L \neq \{0_U\}$. Find an eigenvalue of $M$.

(k) Suppose $U = V$ and $\ker L \neq \{0_U\}$. Find $\det M$.


4. Consider the system of equations

\[
\begin{array}{rcrcrcrcr}
x &+& y &+& z &+& w &=& 1 \\
x &+& 2y &+& 2z &+& 2w &=& 1 \\
x &+& 2y &+& 3z &+& 3w &=& 1
\end{array}
\]

Express this system as a matrix equation $MX = V$ and then find the solution
set by computing an LU decomposition for the matrix $M$ (be sure to use
back and forward substitution).


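5. Compute the following determinants

\[
\det \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \quad
\det \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}, \quad
\det \begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{pmatrix},
\]
\[
\det \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 6 & 7 & 8 & 9 & 10 \\ 11 & 12 & 13 & 14 & 15 \\ 16 & 17 & 18 & 19 & 20 \\ 21 & 22 & 23 & 24 & 25 \end{pmatrix}.
\]

Now test your skills on

\[
\det \begin{pmatrix}
1 & 2 & 3 & \cdots & n \\
n + 1 & n + 2 & n + 3 & \cdots & 2n \\
2n + 1 & 2n + 2 & 2n + 3 & \cdots & 3n \\
\vdots & & & \ddots & \vdots \\
n^2 - n + 1 & n^2 - n + 2 & n^2 - n + 3 & \cdots & n^2
\end{pmatrix}.
\]

Make sure to jot down a few brief notes explaining any clever tricks you use.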


6. For which values of a does 


1 1 a 
U = span 0], 2/,{/1 =R?? 
1 —3 
For any special values of $a$ at which $U \neq \mathbb{R}^3$, express the subspace $U$ as the
span of the least number of vectors possible. Give the dimension of $U$ for
these cases and draw a picture showing $U$ inside $\mathbb{R}^3$.


7. Vandermonde determinant: Calculate the following determinants 


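Be sure to factorize your answers, if possible.


Challenging: Compute the determinant

\[
\det \begin{pmatrix}
1 & x_1 & (x_1)^2 & \cdots & (x_1)^{n-1} \\
1 & x_2 & (x_2)^2 & \cdots & (x_2)^{n-1} \\
1 & x_3 & (x_3)^2 & \cdots & (x_3)^{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_n & (x_n)^2 & \cdots & (x_n)^{n-1}
\end{pmatrix}.
\]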


1 
8. (a) Do the vectors 2]|,{2],{0],{1],{0 form a basis for R?? 


Be sure to justify your answer. 


and 


4 
(b) Find a basis for Rt that includes the vectors : 
1 


ew pe ke 


(c) Explain in words how to generalize your computation in part (b) to 
obtain a basis for R” that includes a given pair of (linearly independent) 
vectors u and v. 


9. Elite NASA engineers determine that if a satellite is placed in orbit starting
at a point $O$, it will return exactly to that same point after one orbit of the
earth. Unfortunately, if there is a small mistake in the original location of
the satellite, which the engineers label by a vector $X$ in $\mathbb{R}^3$ with origin$^1$ at $O$,
after one orbit the satellite will instead return to some other point $Y \in \mathbb{R}^3$.
The engineer’s computations show that $Y$ is related to $X$ by a matrix

\[
Y = \begin{pmatrix} 0 & \frac{1}{2} & 1 \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ 1 & \frac{1}{2} & 0 \end{pmatrix} X.
\]

$^1$This is a spy satellite. The exact location of $O$, the orientation of the coordinate axes
in $\mathbb{R}^3$ and the unit system employed by the engineers are CIA secrets.

(a) Find all eigenvalues of the above matrix. 
(b) Determine all possible eigenvectors associated with each eigenvalue. 
Let us assume that the rule found by the engineers applies to all subsequent 


orbits. Discuss case by case, what will happen to the satellite if the initial 
mistake in its location is in a direction given by an eigenvector. 


10. In this problem the scalars in the vector spaces are bits (0, 1 with $1 + 1 = 0$).
The space $B^k$ is the vector space of bit-valued, $k$-component column vectors.

(a) Find a basis for $B^3$.


(b) Your answer to part (a) should be a list of vectors $v_1, v_2, \ldots, v_n$. What
number did you find for $n$?


(c) How many elements are there in the set $B^3$?

(d) What is the dimension of the vector space $B^3$?


(e) Suppose $L : B^3 \to B = \{0, 1\}$ is a linear transformation. Explain why
specifying $L(v_1), L(v_2), \ldots, L(v_n)$ completely determines $L$.


(f) Use the notation of part (e) to list all linear transformations
\[ L : B^3 \to B. \]
How many different linear transformations did you find? Compare your
answer to part (c).


(g) Suppose $L_1 : B^3 \to B$ and $L_2 : B^3 \to B$ are linear transformations,
and $\alpha$ and $\beta$ are bits. Define a new map $(\alpha L_1 + \beta L_2) : B^3 \to B$ by

\[ (\alpha L_1 + \beta L_2)(v) = \alpha L_1(v) + \beta L_2(v). \]

Is this map a linear transformation? Explain.


(h) Do you think the set of all linear transformations from $B^3$ to $B$ is a
vector space using the addition rule above? If you answer yes, give a
basis for this vector space and state its dimension.
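
Because everything in this problem is finite, a computer can enumerate the objects involved, which is a useful way to check hand counts. The sketch below is one possible setup and assumes the reading $B^3$ for the space in question; vectors are stored as tuples of bits, and `make_linear_map` and `is_linear` are names invented here.

```python
from itertools import product

# Every bit-valued vector with three components.
B3 = list(product((0, 1), repeat=3))

# The standard basis, used here as one convenient choice of v_1, v_2, v_3.
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def make_linear_map(values):
    """Build the map L determined by values = (L(v_1), L(v_2), L(v_3))."""
    def L(v):
        # v = c1 v_1 + c2 v_2 + c3 v_3, so L(v) = c1 L(v_1) + c2 L(v_2) + c3 L(v_3).
        return sum(c * val for c, val in zip(v, values)) % 2
    return L


def is_linear(L):
    """Check additivity over bits; with scalars 0 and 1 this is all that is needed."""
    if L((0, 0, 0)) != 0:
        return False
    for u, v in product(B3, repeat=2):
        total = tuple((a + b) % 2 for a, b in zip(u, v))
        if L(total) != (L(u) + L(v)) % 2:
            return False
    return True


# One candidate map for each assignment of bits to the basis vectors.
linear_maps = [make_linear_map(values) for values in product((0, 1), repeat=3)]
```

Counting `B3` and `linear_maps`, and running `is_linear` on each map, gives a numerical cross-check of the answers to parts (c), (e) and (f).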
11. A team of distinguished, post-doctoral engineers analyzes the design for a
bridge across the English channel. They notice that the force on the center
of the bridge when it is displaced by an amount $X = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$ is given by

\[
F = \begin{pmatrix} -x - y \\ -x - 2y - z \\ -y - z \end{pmatrix}.
\]

Moreover, having read Newton’s Principiæ, they know that force is proportional
to acceleration so that$^2$
\[ F = \frac{d^2 X}{dt^2}. \]

Since the engineers are worried the bridge might start swaying in the heavy
channel winds, they search for an oscillatory solution to this equation of the
form$^3$
\[
X = \cos(\omega t) \begin{pmatrix} a \\ b \\ c \end{pmatrix}.
\]


(a) By plugging their proposed solution in the above equations the engineers
find an eigenvalue problem

\[
M \begin{pmatrix} a \\ b \\ c \end{pmatrix} = -\omega^2 \begin{pmatrix} a \\ b \\ c \end{pmatrix}.
\]

Here $M$ is a $3 \times 3$ matrix. Which $3 \times 3$ matrix $M$ did the engineers
find? Justify your answer.


(b) Find the eigenvalues and eigenvectors of the matrix $M$.


(c) The number $|\omega|$ is often called a characteristic frequency. What characteristic
frequencies do you find for the proposed bridge?


(d) Find an orthogonal matrix $P$ such that $MP = PD$ where $D$ is a
diagonal matrix. Be sure to also state your result for $D$.


(e) Is there a direction in which displacing the bridge yields no force? If
so give a vector in that direction. Briefly evaluate the quality of this
bridge design.


$^2$The bridge is intended for French and English military vehicles, so the exact units,
coordinate system and constant of proportionality are state secrets.
$^3$Here, $a$, $b$, $c$ and $\omega$ are constants which we aim to calculate.


12. Conic Sections: The equation for the most general conic section is given by

\[ ax^2 + 2bxy + dy^2 + 2cx + 2ey + f = 0. \]

Our aim is to analyze the solutions to this equation using matrices.


(a) Rewrite the above quadratic equation as one of the form

\[ X^T M X + X^T C + C^T X + f = 0 \]

relating an unknown column vector $X = \begin{pmatrix} x \\ y \end{pmatrix}$, its transpose $X^T$, a
$2 \times 2$ matrix $M$, a constant column vector $C$ and the constant $f$.


(b) Does your matrix $M$ obey any special properties? Find its eigenvalues.
You may call your answers $\lambda$ and $\mu$ for the rest of the problem to save
writing.


For the rest of this problem we will focus on central conics for
which the matrix $M$ is invertible.


(c) Your equation in part (a) above should be quadratic in $X$. Recall
that if $m \neq 0$, the quadratic equation $mx^2 + 2cx + f = 0$ can be
rewritten by completing the square

\[ m\left(x + \frac{c}{m}\right)^2 = \frac{c^2}{m} - f. \]

Being very careful that you are now dealing with matrices, use the
same trick to rewrite your answer to part (a) in the form

\[ Y^T M Y = g. \]

Make sure you give formulas for the new unknown column vector $Y$
and constant $g$ in terms of $X$, $M$, $C$ and $f$. You need not multiply out
any of the matrix expressions you find.


If all has gone well, you have found a way to shift coordinates
for the original conic equation to a new coordinate system
with its origin at the center of symmetry. Our next aim is
to rotate the coordinate axes to produce a readily recognizable
equation.


(d) Why is the angle between vectors $V$ and $W$ not changed when you
replace them by $PV$ and $PW$ for $P$ any orthogonal matrix?


(e) Explain how to choose an orthogonal matrix $P$ such that $MP = PD$
where $D$ is a diagonal matrix.


(f) For the choice of $P$ above, define our final unknown vector $Z$ by $Y =
PZ$. Find an expression for $Y^T M Y$ in terms of $Z$ and the eigenvalues
of $M$.


(g) Call $Z = \begin{pmatrix} z \\ w \end{pmatrix}$. What equation do $z$ and $w$ obey? (Hint: write your
answer using $\lambda$, $\mu$ and $g$.)


(h) Central conics are circles, ellipses, hyperbolae or a pair of straight lines.
Give examples of values of $(\lambda, \mu, g)$ which produce each of these cases.


Let L: V — W bea linear transformation between finite-dimensional vector 
spaces V and W, and let M be a matrix for L (with respect to some basis 
for V and some basis for W). We know that L has an inverse if and only if 
it is bijective, and we know a lot of ways to tell whether M has an inverse. 
In fact, L has an inverse if and only if M has an inverse: 


(a) Suppose that L is bijective (i.e., one-to-one and onto). 


i. Show that dim V = rank L = dim W. 
ii. Show that 0 is not an eigenvalue of M. 


iii. Show that M is an invertible matrix. 
(b) Now, suppose that M is an invertible matrix. 


i. Show that 0 is not an eigenvalue of M. 
ii. Show that L is injective. 


iii. Show that L is surjective. 


14. Captain Conundrum gives Queen Quandary a pair of newborn doves, male
and female, for her birthday. After one year, this pair of doves breed and
produce a pair of dove eggs. One year later these eggs hatch, yielding a new
pair of doves, while the original pair of doves breed again and an additional
pair of eggs are laid. Captain Conundrum is very happy because now he will
never need to buy the Queen a present ever again!


Let us say that in year zero, the Queen has no doves. In year one she has
one pair of doves, in year two she has two pairs of doves etc... Call $F_n$ the
number of pairs of doves in year n. For example, $F_0 = 0$, $F_1 = 1$ and
$F_2 = 1$. Assume no doves die and that the same breeding pattern continues
well into the future. Then $F_3 = 2$ because the eggs laid by the first pair of
doves in year two hatch. Notice also that in year three, two pairs of eggs are
laid (by the first and second pair of doves). Thus $F_4 = 3$.

(a) Compute $F_5$ and $F_6$.


(b) Explain why (for any n > 2) the following recursion relation holds

$$F_n = F_{n-1} + F_{n-2}.$$


(c) Let us introduce a column vector $X_n = \begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix}$. Compute $X_1$ and $X_2$.
Verify that these vectors obey the relationship

$$X_2 = M X_1 \quad \text{where} \quad M = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}.$$


(d) Show that $X_{n+1} = M X_n$.


(e) Diagonalize M. (I.e., write M as a product $M = P D P^{-1}$ where D is
diagonal.)


(f) Find a simple expression for $M^n$ in terms of P, D and $P^{-1}$.


(g) Show that $X_{n+1} = M^n X_1$.


(h) The number

$$\varphi = \frac{1 + \sqrt{5}}{2}$$

is called the golden ratio. Write the eigenvalues of M in terms of φ.


(i) Put your results from parts (c), (f) and (g) together (along with a short
matrix computation) to find the formula for the number of doves $F_n$
in year n expressed in terms of φ, 1 − φ and n.


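15. Use Gram-Schmidt to find an orthonormal basis for

$$\operatorname{span}\left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \\ 2 \end{pmatrix} \right\}.$$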


16. Let M be the matrix of a linear transformation L : V → W in given bases
for V and W. Fill in the blanks below with one of the following six vector
spaces: V, W, ker L, $(\ker L)^\perp$, im L, $(\operatorname{im} L)^\perp$.


(a) The columns of M span ______ in the basis given for ______.

(b) The rows of M span ______ in the basis given for ______.

(c) Suppose

$$M = \begin{pmatrix} 1 & 2 & 1 & 3 \\ 2 & 1 & -1 & 2 \\ 1 & 0 & 0 & -1 \\ 4 & 1 & -1 & 0 \end{pmatrix}$$

is the matrix of L in the bases $\{v_1, v_2, v_3, v_4\}$ for V and $\{w_1, w_2, w_3, w_4\}$
for W. Find bases for ker L and im L. Use the dimension formula to check
your result.


17. Captain Conundrum collects the following data set

 y |  x
 5 | −2
 2 | −1
 0 |  1
 3 |  2

which he believes to be well-approximated by a parabola

$$y = ax^2 + bx + c.$$


(a) Write down a system of four linear equations for the unknown coefficients
a, b and c.

(b) Write the augmented matrix for this system of equations.

(c) Find the reduced row echelon form for this augmented matrix.

(d) Are there any solutions to this system?

(e) Find the least squares solution to the system.

(f) What value does Captain Conundrum predict for y when x = 2?


and believe that the result is well modeled by a straight line

$$y = mx + b.$$


(a) Write down a linear system of equations you could use to find the slope
m and constant term b.


(b) Arrange the unknowns (m, b) in a column vector X and write your
answer to (a) as a matrix equation

$$MX = V.$$

Be sure to give explicit expressions for the matrix M and column vector
V.


(c) For a generic data set, would you expect your system of equations to
have a solution? Briefly explain your answer.


(d) Calculate $M^T M$ and $(M^T M)^{-1}$ (for the latter computation, state the
condition required for the inverse to exist).


(e) Compute the least squares solution for m and b.


(f) The least squares method determines a vector X that minimizes the
length of the vector V − MX. Draw a rough sketch of the three data
points in the (x, y)-plane as well as their least squares fit. Indicate how
the components of V − MX could be obtained from your picture.


Solutions 


1. You can find the definitions for all these terms by consulting the index of 
this book. 


2. Both junctions give the same equation for the currents

$$I + J + 13 = 0.$$


There are three voltage loops (one on the left, one on the right and one going
around the outside of the circuit). Respectively, they give the equations

$$60 - I - 80 - 3I = 0$$
$$80 + 2J - V + 3J = 0$$
$$60 - I + 2J - V + 3J - 3I = 0. \qquad (F.1)$$


The above equations are easily solved (either using an augmented matrix
and row reducing, or by substitution). The result is I = −5 Amps, J = −8
Amps, V = 40 Volts.
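
The substitution route mentioned above can be sketched in a few lines of Python. This is only an illustrative check of the three equations as printed; the variable names are chosen here and are not part of the original problem.

```python
from fractions import Fraction

# First loop:  60 - I - 80 - 3I = 0   =>  -20 - 4I = 0
I = Fraction(-20, 4)
# Junction:    I + J + 13 = 0
J = -13 - I
# Second loop: 80 + 2J - V + 3J = 0   =>  V = 80 + 5J
V = 80 + 5 * J

# The outer loop equation is the sum of the two inner loop equations,
# so it should vanish automatically once the other two are solved.
outer_loop = 60 - I + 2 * J - V + 3 * J - 3 * I

print(I, J, V, outer_loop)
```

The last quantity is a consistency check: only two of the three loop equations are independent.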


3. (a) m.

n × n.

m × m.


Yes. This relies on ker M = 0 because if $M^T M$ had a non-trivial kernel,
then there would be a non-zero solution X to $M^T M X = 0$. But then
by multiplying on the left by $X^T$ we see that ||MX|| = 0. This in turn
implies MX = 0 which contradicts the triviality of the kernel of M.


Yes because $(M^T M)^T = M^T (M^T)^T = M^T M$.

Yes, all symmetric matrices have a basis of eigenvectors.

No, because otherwise it would not be invertible.

Since the kernel of L is non-trivial, M must have 0 as an eigenvalue.


Since M has a zero eigenvalue in this case, its determinant must vanish.
I.e., det M = 0.


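4. To begin with the system becomes

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 2 \\ 1 & 2 & 3 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

Then

$$M = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 2 \\ 1 & 2 & 3 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 1 & 2 & 2 \end{pmatrix}$$
$$= \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix} = LU.$$

So now MX = V becomes LW = V where $W = UX = \begin{pmatrix} a \\ b \\ c \end{pmatrix}$ (say). Thus
we solve LW = V by forward substitution

$$a = 1, \quad a + b = 1, \quad a + b + c = 1 \;\Rightarrow\; a = 1,\ b = 0,\ c = 0.$$

Now solve UX = W by back substitution

$$x + y + z + w = 1, \quad y + z + w = 0, \quad z + w = 0$$
$$\Rightarrow\; w = \mu \text{ (arbitrary)}, \quad z = -\mu, \quad y = 0, \quad x = 1.$$

The solution set is

$$\left\{ \begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ -\mu \\ \mu \end{pmatrix} : \mu \in \mathbb{R} \right\}.$$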
(a) 


5. First

$$\det \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = -2.$$


All the other determinants vanish because the first three rows of each matrix
are not independent. Indeed, $2R_2 - R_1 = R_3$ in each case, so we can make
row operations to get a row of zeros and thus a zero determinant.


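6. If U spans $\mathbb{R}^3$, then we must be able to express any vector $X = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \in \mathbb{R}^3$ as

$$X = c^1 \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + c^2 \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix} + c^3 \begin{pmatrix} a \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 & a \\ 0 & 2 & 1 \\ 1 & -3 & 0 \end{pmatrix} \begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix},$$

for some coefficients $c^1$, $c^2$ and $c^3$. This is a linear system. We could solve
for $c^1$, $c^2$ and $c^3$ using an augmented matrix and row operations. However,
since we know that $\dim \mathbb{R}^3 = 3$, if U spans $\mathbb{R}^3$, it will also be a basis. Then
the solution for $c^1$, $c^2$ and $c^3$ would be unique. Hence, the 3 × 3 matrix above
must be invertible, so we examine its determinant

$$\det \begin{pmatrix} 1 & 1 & a \\ 0 & 2 & 1 \\ 1 & -3 & 0 \end{pmatrix} = 1 \cdot (2 \cdot 0 - 1 \cdot (-3)) + 1 \cdot (1 \cdot 1 - a \cdot 2) = 4 - 2a.$$

Thus U spans $\mathbb{R}^3$ whenever a ≠ 2. When a = 2 we can write the third vector
in U in terms of the preceding ones as

$$\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} = \frac{3}{2} \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix}.$$

(You can obtain this result, or an equivalent one by studying the above linear
system with X = 0, i.e., the associated homogeneous system.) The two
vectors $\begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$ are clearly linearly independent, so this is the
least number of vectors spanning U for this value of a. Also we see that
dim U = 2 in this case. Your picture should be a plane in $\mathbb{R}^3$ through the
origin containing the vectors $\begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$.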
7.
$$\det \begin{pmatrix} 1 & x \\ 1 & y \end{pmatrix} = y - x.$$

$$\det \begin{pmatrix} 1 & x & x^2 \\ 1 & y & y^2 \\ 1 & z & z^2 \end{pmatrix} = \det \begin{pmatrix} 1 & x & x^2 \\ 0 & y - x & y^2 - x^2 \\ 0 & z - x & z^2 - x^2 \end{pmatrix}$$
$$= (y - x)(z^2 - x^2) - (y^2 - x^2)(z - x) = (y - x)(z - x)(z - y).$$

$$\det \begin{pmatrix} 1 & x & x^2 & x^3 \\ 1 & y & y^2 & y^3 \\ 1 & z & z^2 & z^3 \\ 1 & w & w^2 & w^3 \end{pmatrix} = \det \begin{pmatrix} 1 & x & x^2 & x^3 \\ 0 & y - x & y^2 - x^2 & y^3 - x^3 \\ 0 & z - x & z^2 - x^2 & z^3 - x^3 \\ 0 & w - x & w^2 - x^2 & w^3 - x^3 \end{pmatrix}$$
$$= \det \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & y - x & y(y - x) & y^2(y - x) \\ 0 & z - x & z(z - x) & z^2(z - x) \\ 0 & w - x & w(w - x) & w^2(w - x) \end{pmatrix}$$
$$= (y - x)(z - x)(w - x) \det \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & y & y^2 \\ 0 & 1 & z & z^2 \\ 0 & 1 & w & w^2 \end{pmatrix}$$
$$= (y - x)(z - x)(w - x)(z - y)(w - y)(w - z).$$


From the 4 × 4 case above, you can see all the tricks required for a general
Vandermonde matrix. First zero out the first column by subtracting the first
row from all other rows (which leaves the determinant unchanged). Now zero
out the top row by subtracting $x_1$ times the first column from the second
column, $x_1$ times the second column from the third column etc. Again these
column operations do not change the determinant. Now factor out $x_2 - x_1$
from the second row, $x_3 - x_1$ from the third row, etc. This does change the
determinant so we write these factors outside the remaining determinant,
which is just the same problem but for the (n − 1) × (n − 1) case. Iterating
the same procedure gives the result

$$\det \begin{pmatrix} 1 & x_1 & (x_1)^2 & \cdots & (x_1)^{n-1} \\ 1 & x_2 & (x_2)^2 & \cdots & (x_2)^{n-1} \\ 1 & x_3 & (x_3)^2 & \cdots & (x_3)^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & (x_n)^2 & \cdots & (x_n)^{n-1} \end{pmatrix} = \prod_{i > j} (x_i - x_j).$$

(Here ∏ stands for a multiple product, just like Σ stands for a multiple
sum.)
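
As a sanity check on the product formula, here is a minimal Python sketch that compares a brute-force cofactor determinant with the product over pairs. The sample values in `xs` and all function names are illustrative choices, not part of the original solution.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def vandermonde(xs):
    """Rows (1, x, x^2, ..., x^(n-1)), one row for each x in xs."""
    n = len(xs)
    return [[x ** k for k in range(n)] for x in xs]

def product_formula(xs):
    """Product of (x_i - x_j) over all index pairs with i > j."""
    result = 1
    for j, i in combinations(range(len(xs)), 2):  # combinations yields j before i
        result *= xs[i] - xs[j]
    return result

xs = [2, 3, 5, 7]  # arbitrary sample values, chosen only for illustration
print(det(vandermonde(xs)), product_formula(xs))
```

Both numbers should agree for any choice of `xs`; repeated values make both vanish, which matches the fact that two equal rows force a zero determinant.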
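8. (a) No, a basis for $\mathbb{R}^3$ must have exactly three vectors.


(b) We first extend the original vectors by the standard basis for $\mathbb{R}^4$ and
then try to eliminate two of them by considering

$$\alpha \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} + \beta \begin{pmatrix} 4 \\ 3 \\ 2 \\ 1 \end{pmatrix} + \gamma \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \delta \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} + \eta \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = 0.$$

So we study

$$\begin{pmatrix} 1 & 4 & 1 & 0 & 0 & 0 \\ 2 & 3 & 0 & 1 & 0 & 0 \\ 3 & 2 & 0 & 0 & 1 & 0 \\ 4 & 1 & 0 & 0 & 0 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 4 & 1 & 0 & 0 & 0 \\ 0 & -5 & -2 & 1 & 0 & 0 \\ 0 & -10 & -3 & 0 & 1 & 0 \\ 0 & -15 & -4 & 0 & 0 & 1 \end{pmatrix}$$
$$\sim \begin{pmatrix} 1 & 0 & -\frac{3}{5} & \frac{4}{5} & 0 & 0 \\ 0 & 1 & \frac{2}{5} & -\frac{1}{5} & 0 & 0 \\ 0 & 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 2 & -3 & 0 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 & -\frac{2}{5} & \frac{3}{5} & 0 \\ 0 & 1 & 0 & \frac{3}{5} & -\frac{2}{5} & 0 \\ 0 & 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 0 & 1 & -2 & 1 \end{pmatrix}.$$

From here we can keep row reducing to achieve RREF, but we can
already see that the non-pivot variables will be ε and η. Hence we can
eject the last two vectors and obtain as our basis

$$\left\{ \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}, \begin{pmatrix} 4 \\ 3 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \right\}.$$

Of course, this answer is far from unique!


(c) The method is the same as above. Add the standard basis to {u, v}
to obtain the linearly dependent set $\{u, v, e_1, \ldots, e_n\}$. Then put these
vectors as the columns of a matrix and row reduce. The standard
basis vectors in columns corresponding to the non-pivot variables can
be removed.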


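9.
$$\det \begin{pmatrix} \lambda & -\frac{1}{2} & -1 \\ -\frac{1}{2} & \lambda - \frac{1}{2} & -\frac{1}{2} \\ -1 & -\frac{1}{2} & \lambda \end{pmatrix} = \lambda\left(\left(\lambda - \tfrac{1}{2}\right)\lambda - \tfrac{1}{4}\right) + \tfrac{1}{2}\left(-\tfrac{\lambda}{2} - \tfrac{1}{2}\right) - \left(\lambda - \tfrac{1}{4}\right)$$
$$= \lambda^3 - \tfrac{1}{2}\lambda^2 - \tfrac{3}{2}\lambda = \lambda(\lambda + 1)\left(\lambda - \tfrac{3}{2}\right).$$

Hence the eigenvalues are 0, −1, 3/2.


When λ = 0 we must solve the homogeneous system

$$\left(\begin{array}{ccc|c} 0 & \frac{1}{2} & 1 & 0 \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & 0 \\ 1 & \frac{1}{2} & 0 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{2} & 1 & 0 \\ 0 & \frac{1}{4} & \frac{1}{2} & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$$

So we find the eigenvector $\begin{pmatrix} s \\ -2s \\ s \end{pmatrix}$ where s ≠ 0 is arbitrary.


For λ = −1

$$\left(\begin{array}{ccc|c} 1 & \frac{1}{2} & 1 & 0 \\ \frac{1}{2} & \frac{3}{2} & \frac{1}{2} & 0 \\ 1 & \frac{1}{2} & 1 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$$

So we find the eigenvector $\begin{pmatrix} -s \\ 0 \\ s \end{pmatrix}$ where s ≠ 0 is arbitrary.


Finally, for λ = 3/2

$$\left(\begin{array}{ccc|c} -\frac{3}{2} & \frac{1}{2} & 1 & 0 \\ \frac{1}{2} & -1 & \frac{1}{2} & 0 \\ 1 & \frac{1}{2} & -\frac{3}{2} & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$$

So we find the eigenvector $\begin{pmatrix} s \\ s \\ s \end{pmatrix}$ where s ≠ 0 is arbitrary.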
If the mistake X is in the direction of the eigenvector $\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}$, then Y = 0.
I.e., the satellite returns to the origin O. For all subsequent orbits it will
again return to the origin. NASA would be very pleased in this case.

If the mistake X is in the direction $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, then Y = −X. Hence the
satellite will move to the point opposite to X. After the next orbit it will move
back to X. It will continue this wobbling motion indefinitely. Since this is a
stable situation, again, the elite engineers will pat themselves on the back.

Finally, if the mistake X is in the direction $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, the satellite will move to a
point Y = (3/2)X which is further away from the origin. The same will happen
for all subsequent orbits, with the satellite moving a factor 3/2 further away
from O each orbit (in reality, after several orbits, the approximations used
by the engineers in their calculations probably fail and a new computation
will be needed). In this case, the satellite will be lost in outer space and the
engineers will likely lose their jobs!


10. (a) A basis for $B^3$ is $\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\}$.

(b) 3.

(c) $2^3 = 8$.

(d) $\dim B^3 = 3$.

(e) Because the vectors $\{v_1, v_2, v_3\}$ are a basis any element $v \in B^3$ can be
written uniquely as $v = b^1 v_1 + b^2 v_2 + b^3 v_3$ for some triplet of bits $\begin{pmatrix} b^1 \\ b^2 \\ b^3 \end{pmatrix}$.
Hence, to compute L(v) we use linearity of L

$$L(v) = L(b^1 v_1 + b^2 v_2 + b^3 v_3) = b^1 L(v_1) + b^2 L(v_2) + b^3 L(v_3)$$
$$= \begin{pmatrix} L(v_1) & L(v_2) & L(v_3) \end{pmatrix} \begin{pmatrix} b^1 \\ b^2 \\ b^3 \end{pmatrix}.$$


(f) From the notation of the previous part, we see that we can list linear
transformations $L : B^3 \to B$ by writing out all possible bit-valued row
vectors

(0 0 0), (1 0 0), (0 1 0), (0 0 1),
(1 1 0), (1 0 1), (0 1 1), (1 1 1).

There are $2^3 = 8$ different linear transformations $L : B^3 \to B$, exactly
the same as the number of elements in $B^3$.


(g) Yes, essentially just because $L_1$ and $L_2$ are linear transformations. In
detail for any bits (a, b) and vectors (u, v) in $B^3$ it is easy to check the
linearity property for $(\alpha L_1 + \beta L_2)$

$$(\alpha L_1 + \beta L_2)(au + bv) = \alpha L_1(au + bv) + \beta L_2(au + bv)$$
$$= \alpha a L_1(u) + \alpha b L_1(v) + \beta a L_2(u) + \beta b L_2(v)$$
$$= a(\alpha L_1(u) + \beta L_2(u)) + b(\alpha L_1(v) + \beta L_2(v))$$
$$= a(\alpha L_1 + \beta L_2)(u) + b(\alpha L_1 + \beta L_2)(v).$$

Here the first line used the definition of $(\alpha L_1 + \beta L_2)$, the second line
depended on the linearity of $L_1$ and $L_2$, the third line was just algebra
and the fourth used the definition of $(\alpha L_1 + \beta L_2)$ again.


(h) Yes. The easiest way to see this is the identification above of these
maps with bit-valued row vectors. In that notation, a basis is

{(1 0 0), (0 1 0), (0 0 1)}.

Since this (spanning) set has three (linearly independent) elements,
the vector space of linear maps $B^3 \to B$ has dimension 3. This is an
example of a general notion called the dual vector space.
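
The count of eight linear maps can also be confirmed by brute force. The following Python sketch enumerates every function from $B^3$ to B and keeps the additive ones; the helper names are made up for this illustration, and additivity alone is tested because the only scalars are 0 and 1.

```python
from itertools import product

# All vectors of B^3; arithmetic is done modulo 2.
vectors = list(product([0, 1], repeat=3))

def add(u, v):
    """Vector addition in B^3."""
    return tuple((a + b) % 2 for a, b in zip(u, v))

def apply_row(row, v):
    """Apply the linear map given by a bit-valued row vector to v."""
    return sum(r * x for r, x in zip(row, v)) % 2

def is_linear(f):
    """Over Z_2, f(u + v) = f(u) + f(v) already implies f(0) = 0 and hence linearity."""
    return all(f[add(u, v)] == (f[u] + f[v]) % 2 for u in vectors for v in vectors)

# Every function B^3 -> B is a choice of one output bit for each of the 8 inputs.
all_functions = [dict(zip(vectors, outputs)) for outputs in product([0, 1], repeat=8)]
linear_functions = [f for f in all_functions if is_linear(f)]

# Value tables of the maps defined by the eight row vectors.
from_rows = [{v: apply_row(row, v) for v in vectors} for row in vectors]

print(len(all_functions), len(linear_functions))
```

Out of all 256 functions only those coming from row vectors survive the linearity test, which is the identification used in the solution.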


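11.
$$\frac{d^2}{dt^2} \cos(\omega t) \begin{pmatrix} a \\ b \\ c \end{pmatrix} = -\omega^2 \cos(\omega t) \begin{pmatrix} a \\ b \\ c \end{pmatrix}.$$

Hence

$$F = \cos(\omega t) \begin{pmatrix} -a - b \\ -a - 2b - c \\ -b - c \end{pmatrix} = \cos(\omega t) \begin{pmatrix} -1 & -1 & 0 \\ -1 & -2 & -1 \\ 0 & -1 & -1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = -\omega^2 \cos(\omega t) \begin{pmatrix} a \\ b \\ c \end{pmatrix},$$

so

$$M = \begin{pmatrix} -1 & -1 & 0 \\ -1 & -2 & -1 \\ 0 & -1 & -1 \end{pmatrix}.$$

$$\det \begin{pmatrix} \lambda + 1 & 1 & 0 \\ 1 & \lambda + 2 & 1 \\ 0 & 1 & \lambda + 1 \end{pmatrix} = (\lambda + 1)\big((\lambda + 2)(\lambda + 1) - 1\big) - (\lambda + 1)$$
$$= (\lambda + 1)\big((\lambda + 2)(\lambda + 1) - 2\big)$$
$$= (\lambda + 1)(\lambda^2 + 3\lambda) = \lambda(\lambda + 1)(\lambda + 3),$$

so the eigenvalues are λ = 0, −1, −3.
For the eigenvectors, when λ = 0 we study:

$$M - 0 \cdot I = \begin{pmatrix} -1 & -1 & 0 \\ -1 & -2 & -1 \\ 0 & -1 & -1 \end{pmatrix} \sim \begin{pmatrix} 1 & 1 & 0 \\ 0 & -1 & -1 \\ 0 & -1 & -1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix},$$

so $\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$ is an eigenvector.
For λ = −1

$$M - (-1) \cdot I = \begin{pmatrix} 0 & -1 & 0 \\ -1 & -1 & -1 \\ 0 & -1 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

so $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$ is an eigenvector.
For λ = −3

$$M - (-3) \cdot I = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 1 & -1 \\ 0 & -1 & 2 \end{pmatrix} \sim \begin{pmatrix} 1 & -1 & 1 \\ 0 & 1 & -2 \\ 0 & -1 & 2 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix},$$

so $\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$ is an eigenvector.


(c) The characteristic frequencies are 0, 1, √3.

(d) The orthogonal change of basis matrix is

$$P = \begin{pmatrix} \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ -\frac{1}{\sqrt{3}} & 0 & \frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \end{pmatrix}.$$

It obeys MP = PD where

$$D = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -3 \end{pmatrix}.$$

(e) Yes, the direction given by the eigenvector $\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$ because its eigenvalue
is zero. This is probably a bad design for a bridge because it can
be displaced in this direction with no force!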


12. (a) If we call $M = \begin{pmatrix} a & b \\ b & d \end{pmatrix}$, then $X^T M X = ax^2 + 2bxy + dy^2$. Similarly
putting $C = \begin{pmatrix} c \\ e \end{pmatrix}$ yields $X^T C + C^T X = 2 X \cdot C = 2cx + 2ey$. Thus

$$0 = ax^2 + 2bxy + dy^2 + 2cx + 2ey + f$$
$$= \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} c \\ e \end{pmatrix} + \begin{pmatrix} c & e \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + f.$$


(b) Yes, the matrix M is symmetric, so it will have a basis of eigenvectors
and is similar to a diagonal matrix of real eigenvalues.

To find the eigenvalues notice that

$$\det \begin{pmatrix} a - \lambda & b \\ b & d - \lambda \end{pmatrix} = (a - \lambda)(d - \lambda) - b^2 = \left(\lambda - \frac{a + d}{2}\right)^2 - b^2 - \left(\frac{a - d}{2}\right)^2.$$

So the eigenvalues are

$$\lambda = \frac{a + d}{2} + \sqrt{b^2 + \left(\frac{a - d}{2}\right)^2} \quad \text{and} \quad \mu = \frac{a + d}{2} - \sqrt{b^2 + \left(\frac{a - d}{2}\right)^2}.$$


(c) The trick is to write

$$X^T M X + C^T X + X^T C = (X^T + C^T M^{-1}) M (X + M^{-1} C) - C^T M^{-1} C,$$

so that

$$(X^T + C^T M^{-1}) M (X + M^{-1} C) = C^T M^{-1} C - f.$$

Hence $Y = X + M^{-1} C$ and $g = C^T M^{-1} C - f$.


(d) The cosine of the angle between vectors V and W is given by

$$\frac{V \cdot W}{\sqrt{V \cdot V}\,\sqrt{W \cdot W}} = \frac{V^T W}{\sqrt{V^T V}\,\sqrt{W^T W}}.$$

So replacing V → PV and W → PW will always give a factor $P^T P$
inside all the products, but $P^T P = I$ for orthogonal matrices. Hence
none of the dot products in the above formula changes, so neither does
the angle between V and W.


(e) If we take the eigenvectors of M, normalize them (i.e. divide them
by their lengths), and put them in a matrix P (as columns) then P
will be an orthogonal matrix. (If it happens that λ = μ, then we
also need to make sure the eigenvectors spanning the two dimensional
eigenspace corresponding to λ are orthogonal.) Then, since M times
the eigenvectors yields just the eigenvectors back again multiplied by
their eigenvalues, it follows that MP = PD where D is the diagonal
matrix made from eigenvalues.


(f) If Y = PZ, then $Y^T M Y = Z^T P^T M P Z = Z^T P^T P D Z = Z^T D Z$
where $D = \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}$.


(g) Using part (f) and (c) we have

$$\lambda z^2 + \mu w^2 = g.$$


(h) When λ = μ and $g/\lambda = R^2$, we get the equation for a circle of radius R in
the (z, w)-plane. When λ, μ and g are positive, we have the equation for
an ellipse. Vanishing g along with λ and μ of opposite signs gives a pair
of straight lines. When g is non-vanishing, but λ and μ have opposite
signs, the result is a hyperbola. These shapes all come from
cutting a cone with a plane, and are therefore called conic sections.
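
To see parts (b), (c) and (g) working together on a concrete conic, here is a small Python sketch. The function name `central_conic` and the sample coefficients are hypothetical choices made for this illustration; the formulas used are the ones derived above, $Y = X + M^{-1}C$, $g = C^T M^{-1} C - f$ and the eigenvalue expression from part (b).

```python
from fractions import Fraction
from math import sqrt

def central_conic(a, b, d, c, e, f):
    """Return (M^{-1}C, g, (lam, mu)) for ax^2 + 2bxy + dy^2 + 2cx + 2ey + f = 0."""
    det_m = a * d - b * b
    if det_m == 0:
        raise ValueError("M is not invertible, so this is not a central conic")
    # M^{-1} C for M = [[a, b], [b, d]] and C = (c, e), using the 2x2 inverse formula.
    shift = (Fraction(d * c - b * e, det_m), Fraction(a * e - b * c, det_m))
    # g = C^T M^{-1} C - f
    g = c * shift[0] + e * shift[1] - f
    # Eigenvalues (a + d)/2 +/- sqrt(b^2 + ((a - d)/2)^2)
    mean = (a + d) / 2
    radius = sqrt(b * b + ((a - d) / 2) ** 2)
    return shift, g, (mean + radius, mean - radius)

# Hypothetical example: x^2 + y^2 + 2x - 3 = 0, i.e. (x + 1)^2 + y^2 = 4.
shift, g, (lam, mu) = central_conic(a=1, b=0, d=1, c=1, e=0, f=-3)
print(shift, g, lam, mu)
```

For this example the shift moves the origin to the center (−1, 0), the two eigenvalues coincide, and g/λ is the squared radius, so the circle case of part (h) applies.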


13. We show that L is bijective if and only if M is invertible.


(a) We suppose that L is bijective.

i. Since L is injective, its kernel consists of the zero vector alone.
Hence
null L = dim ker L = 0.
So by the Dimension Formula,
dim V = null L + rank L = rank L.
Since L is surjective, L(V) = W. Thus
rank L = dim L(V) = dim W.
Thereby
dim V = rank L = dim W.

ii. Since dim V = dim W, the matrix M is square so we can talk
about its eigenvalues. Since L is injective, its kernel is the zero
vector alone. That is, the only solution to LX = 0 is $X = 0_V$.
But LX is the same as MX, so the only solution to MX = 0 is
$X = 0_V$. So M does not have zero as an eigenvalue.

iii. Since MX = 0 has no non-zero solutions, the matrix M is invertible.


(b) Now we suppose that M is an invertible matrix.

i. Since M is invertible, the system MX = 0 has no non-zero solutions.
But LX is the same as MX, so the only solution to LX = 0
is $X = 0_V$. So L does not have zero as an eigenvalue.

ii. Since LX = 0 has no non-zero solutions, the kernel of L is the
zero vector alone. So L is injective.

iii. Since M is invertible, we must have that dim V = dim W. By the
Dimension Formula, we have
dim V = null L + rank L
and since $\ker L = \{0_V\}$ we have null L = dim ker L = 0, so
dim W = dim V = rank L = dim L(V).
Since L(V) is a subspace of W with the same dimension as W, it
must be equal to W. To see why, pick a basis B of L(V). Each
element of B is a vector in W, so the elements of B form a linearly
independent set in W. Therefore B is a basis of W, since the size
of B is equal to dim W. So L(V) = span B = W. So L is surjective.


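14. (a) $F_5 = F_3 + F_4 = 2 + 3 = 5$ and $F_6 = F_4 + F_5 = 3 + 5 = 8$.


(b) The number of pairs of doves in any given year equals the number in
the previous year plus those that hatch, and there are as many of the latter
as pairs of doves in the year before the previous year.


(c) $X_1 = \begin{pmatrix} F_1 \\ F_0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $X_2 = \begin{pmatrix} F_2 \\ F_1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, so

$$M X_1 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = X_2.$$


(d) We just need to use the recursion relationship of part (b) in the top
slot of $X_{n+1}$:

$$X_{n+1} = \begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = \begin{pmatrix} F_n + F_{n-1} \\ F_n \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix} = M X_n.$$


(e) Notice M is symmetric so this is guaranteed to work.

$$\det \begin{pmatrix} 1 - \lambda & 1 \\ 1 & -\lambda \end{pmatrix} = \lambda(\lambda - 1) - 1 = \left(\lambda - \frac{1}{2}\right)^2 - \frac{5}{4},$$

so the eigenvalues are $\frac{1 \pm \sqrt{5}}{2}$. Hence the eigenvectors are

$$\begin{pmatrix} \frac{1 \pm \sqrt{5}}{2} \\ 1 \end{pmatrix}$$

respectively (notice that $\frac{1 + \sqrt{5}}{2} + 1 = \frac{1 + \sqrt{5}}{2} \cdot \frac{1 + \sqrt{5}}{2}$ and $\frac{1 - \sqrt{5}}{2} + 1 = \frac{1 - \sqrt{5}}{2} \cdot \frac{1 - \sqrt{5}}{2}$). Thus $M = P D P^{-1}$ with

$$D = \begin{pmatrix} \frac{1 + \sqrt{5}}{2} & 0 \\ 0 & \frac{1 - \sqrt{5}}{2} \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} \frac{1 + \sqrt{5}}{2} & \frac{1 - \sqrt{5}}{2} \\ 1 & 1 \end{pmatrix}.$$


(f) $M^n = (P D P^{-1})^n = P D P^{-1} P D P^{-1} \cdots P D P^{-1} = P D^n P^{-1}$.


(g) Just use the matrix recursion relation of part (d) repeatedly:

$$X_{n+1} = M X_n = M^2 X_{n-1} = \cdots = M^n X_1.$$


(h) The eigenvalues are $\varphi = \frac{1 + \sqrt{5}}{2}$ and $1 - \varphi = \frac{1 - \sqrt{5}}{2}$.


(i) Since $F_n$ is the bottom entry of $X_{n+1} = M^n X_1 = P D^n P^{-1} X_1$, with $X_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$
and $P^{-1} = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & \varphi - 1 \\ -1 & \varphi \end{pmatrix}$, we find

$$X_{n+1} = \frac{1}{\sqrt{5}} \begin{pmatrix} \varphi^{n+1} - (1 - \varphi)^{n+1} \\ \varphi^n - (1 - \varphi)^n \end{pmatrix}.$$

Hence

$$F_n = \frac{\varphi^n - (1 - \varphi)^n}{\sqrt{5}}.$$

These are the famous Fibonacci numbers.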
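
The matrix recursion and the closed formula can be compared numerically. The sketch below is one possible illustration in Python; the function names are invented here, and the closed form is rounded because it is evaluated in floating point.

```python
from math import sqrt

def mat_mul(a, b):
    """Product of two 2x2 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def fib_matrix(n):
    """F_n read off from X_{n+1} = M^n X_1 with X_1 = (F_1, F_0) = (1, 0)."""
    m = [[1, 1], [1, 0]]
    power = [[1, 0], [0, 1]]  # identity matrix, i.e. M^0
    for _ in range(n):
        power = mat_mul(power, m)
    # X_{n+1} = (F_{n+1}, F_n) is the first column of M^n, so F_n is its bottom entry.
    return power[1][0]

def fib_closed_form(n):
    """F_n = (phi^n - (1 - phi)^n) / sqrt(5), rounded to absorb floating-point error."""
    phi = (1 + sqrt(5)) / 2
    return round((phi ** n - (1 - phi) ** n) / sqrt(5))

print([fib_matrix(n) for n in range(8)])
print([fib_closed_form(n) for n in range(8)])
```

The repeated multiplication is the literal content of part (g); diagonalizing M, as in parts (e) and (f), is what turns it into the closed formula.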


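15. Call the three vectors u, v and w, respectively. Then

$$v^\perp = v - \frac{u \cdot v}{u \cdot u}\, u = v - \frac{3}{4} u = \begin{pmatrix} \frac{1}{4} \\ -\frac{3}{4} \\ \frac{1}{4} \\ \frac{1}{4} \end{pmatrix},$$

and

$$w^\perp = w - \frac{u \cdot w}{u \cdot u}\, u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp = w - \frac{3}{4} u - \frac{3/4}{3/4}\, v^\perp = \begin{pmatrix} -1 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$

Dividing by lengths, an orthonormal basis for span{u, v, w} is

$$\left\{ \begin{pmatrix} \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{pmatrix}, \begin{pmatrix} \frac{\sqrt{3}}{6} \\ -\frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{6} \\ \frac{\sqrt{3}}{6} \end{pmatrix}, \begin{pmatrix} -\frac{\sqrt{2}}{2} \\ 0 \\ 0 \\ \frac{\sqrt{2}}{2} \end{pmatrix} \right\}.$$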

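16. (a) The columns of M span im L in the basis given for W.

(b) The rows of M span $(\ker L)^\perp$ in the basis given for V.

(c) First we put M in RREF:

$$M = \begin{pmatrix} 1 & 2 & 1 & 3 \\ 2 & 1 & -1 & 2 \\ 1 & 0 & 0 & -1 \\ 4 & 1 & -1 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 2 & 1 & 3 \\ 0 & -3 & -3 & -4 \\ 0 & -2 & -1 & -4 \\ 0 & -7 & -5 & -12 \end{pmatrix}$$
$$\sim \begin{pmatrix} 1 & 0 & -1 & \frac{1}{3} \\ 0 & 1 & 1 & \frac{4}{3} \\ 0 & 0 & 1 & -\frac{4}{3} \\ 0 & 0 & 2 & -\frac{8}{3} \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & \frac{8}{3} \\ 0 & 0 & 1 & -\frac{4}{3} \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

Hence

$$\ker L = \operatorname{span}\left\{ v_1 - \tfrac{8}{3} v_2 + \tfrac{4}{3} v_3 + v_4 \right\}$$

and, reading off the pivot columns of M (whose entries are coordinates in the basis for W),

$$\operatorname{im} L = \operatorname{span}\{ w_1 + 2w_2 + w_3 + 4w_4,\; 2w_1 + w_2 + w_4,\; w_1 - w_2 - w_4 \}.$$

Thus dim ker L = 1 and dim im L = 3 so

dim ker L + dim im L = 1 + 3 = 4 = dim V.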
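
A quick way to double check the kernel vector and the rank is to let exact arithmetic do the elimination. This Python sketch assumes the matrix M as reconstructed from the row reduction above; the `rank` helper is written for this illustration only.

```python
from fractions import Fraction

M = [[1, 2, 1, 3],
     [2, 1, -1, 2],
     [1, 0, 0, -1],
     [4, 1, -1, 0]]

def rank(matrix):
    """Rank via Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Coordinates of the kernel vector v1 - (8/3) v2 + (4/3) v3 + v4.
k = [Fraction(1), Fraction(-8, 3), Fraction(4, 3), Fraction(1)]
image_of_k = [sum(m * x for m, x in zip(row, k)) for row in M]

print(image_of_k, rank(M))
```

The rank plays the role of dim im L, and 4 minus the rank is dim ker L, which is the dimension formula check requested in the problem.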


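17. (a)
$$5 = 4a - 2b + c$$
$$2 = a - b + c$$
$$0 = a + b + c$$
$$3 = 4a + 2b + c.$$

(b,c,d)

$$\left(\begin{array}{ccc|c} 4 & -2 & 1 & 5 \\ 1 & -1 & 1 & 2 \\ 1 & 1 & 1 & 0 \\ 4 & 2 & 1 & 3 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & -6 & -3 & 5 \\ 0 & -2 & 0 & 2 \\ 0 & -2 & -3 & 3 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & -3 & -1 \\ 0 & 0 & -3 & 1 \end{array}\right).$$

The system has no solutions because the last two rows require both c = 1/3 and c = −1/3, which is impossible.

(e) Let

$$M = \begin{pmatrix} 4 & -2 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & 1 \\ 4 & 2 & 1 \end{pmatrix} \quad \text{and} \quad V = \begin{pmatrix} 5 \\ 2 \\ 0 \\ 3 \end{pmatrix}.$$

Then

$$M^T M = \begin{pmatrix} 34 & 0 & 10 \\ 0 & 10 & 0 \\ 10 & 0 & 4 \end{pmatrix} \quad \text{and} \quad M^T V = \begin{pmatrix} 34 \\ -6 \\ 10 \end{pmatrix}.$$

So

$$\left(\begin{array}{ccc|c} 34 & 0 & 10 & 34 \\ 0 & 10 & 0 & -6 \\ 10 & 0 & 4 & 10 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & \frac{5}{17} & 1 \\ 0 & 10 & 0 & -6 \\ 0 & 0 & \frac{18}{17} & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -\frac{3}{5} \\ 0 & 0 & 1 & 0 \end{array}\right).$$

The least squares solution is a = 1, b = −3/5 and c = 0.

(f) The Captain predicts $y(2) = 1 \cdot 2^2 - \frac{3}{5} \cdot 2 + 0 = \frac{14}{5}$.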
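
The normal equations $M^T M X = M^T V$ for this data set can be assembled and solved mechanically. The following Python sketch uses exact fractions; the small `solve` routine is a generic Gauss-Jordan elimination written for this illustration, not a method prescribed by the text.

```python
from fractions import Fraction

xs = [-2, -1, 1, 2]
ys = [5, 2, 0, 3]

# Design matrix for y = a x^2 + b x + c: one row (x^2, x, 1) per data point.
M = [[x * x, x, 1] for x in xs]

# Normal equations (M^T M) X = M^T V.
MtM = [[sum(row[i] * row[j] for row in M) for j in range(3)] for i in range(3)]
MtV = [sum(row[i] * y for row, y in zip(M, ys)) for i in range(3)]

def solve(matrix, rhs):
    """Solve a square, invertible system by Gauss-Jordan elimination with exact fractions."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [vi - factor * vc for vi, vc in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

a, b, c = solve(MtM, MtV)
prediction = a * 2 ** 2 + b * 2 + c
print(MtM, MtV, (a, b, c), prediction)
```

Working with `Fraction` keeps the coefficients exact, so the result can be compared directly with the hand computation.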
G.1 What is Linear Algebra?


Hint for Review Problem 5 


Looking at the problem statement we find some important information, first 
that oranges always have twice as much sugar as apples, and second that the 
information about the barrel is recorded as (s, f), where s = units of sugar in 
the barrel and f = number of pieces of fruit in the barrel. 


We are asked to find a linear transformation relating this new representa- 
tion to the one in the lecture, where in the lecture x = the number of apples 
and y = the number of oranges. This means we must create a system of equa- 
tions relating the variable x and y to the variables s and f in matrix form. 
Your answer should be the matrix that transforms one set of variables into the 
other. 


Hint: Let λ represent the amount of sugar in each apple.


1. To find the first equation relate f to the variables x and y.


2. To find the second equation, use the hint to figure out how much sugar
is in x apples, and y oranges in terms of λ. Then write an equation for s
using x, y and λ.


G.2 Systems of Linear Equations


Augmented Matrix Notation 


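Why is the augmented matrix

$$\left(\begin{array}{cc|c} 1 & 1 & 27 \\ 2 & -1 & 0 \end{array}\right)$$

equivalent to the system of equations

$$x + y = 27$$
$$2x - y = 0\,?$$

Well the augmented matrix is just a new notation for the matrix equation

$$\begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 27 \\ 0 \end{pmatrix}$$

and if you review your matrix multiplication remember that

$$\begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ 2x - y \end{pmatrix}.$$

This means that

$$\begin{pmatrix} x + y \\ 2x - y \end{pmatrix} = \begin{pmatrix} 27 \\ 0 \end{pmatrix},$$

which is our original equation.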


Equivalence of Augmented Matrices

Let's think about what it means for the two augmented matrices

$$\left(\begin{array}{cc|c} 1 & 1 & 27 \\ 2 & -1 & 0 \end{array}\right) \quad \text{and} \quad \left(\begin{array}{cc|c} 1 & 0 & 9 \\ 0 & 1 & 18 \end{array}\right)$$

to be equivalent: They are certainly not equal, because they don't match in
each component, but since these augmented matrices represent a system, we
might want to introduce a new kind of equivalence relation.

Well we could look at the system of linear equations this represents

$$x + y = 27$$
$$2x - y = 0$$

and notice that the solution is x = 9 and y = 18. The other augmented matrix
represents the system

$$x + 0 \cdot y = 9$$
$$0 \cdot x + y = 18.$$

This clearly has the same solution. The first and second system are related
in the sense that their solutions are the same. Notice that it is really
nice to have the augmented matrix in the second form, because the matrix
multiplication can be done in your head.


Hints for Review Question 10


This question looks harder than it actually is:


Row equivalence of matrices is an example of an equivalence
relation. Recall that a relation ∼ on a set of objects U
is an equivalence relation if the following three properties
are satisfied:


• Reflexive: For any x ∈ U, we have x ∼ x.
• Symmetric: For any x, y ∈ U, if x ∼ y then y ∼ x.
• Transitive: For any x, y and z ∈ U, if x ∼ y and y ∼ z
then x ∼ z.


(For a more complete discussion of equivalence relations, see
Webwork Homework 0, Problem 4)


Show that row equivalence of augmented matrices is an equivalence
relation.


Firstly remember that an equivalence relation is just a more general version
of "equals". Here we defined row equivalence for augmented matrices
whose linear systems have solutions by the property that their solutions are
the same.


So this question is really about the word same. Let's do a silly example:
Let's replace the set of augmented matrices by the set of people who have hair.
We will call two people equivalent if they have the same hair color. There are
three properties to check:


• Reflexive: This just requires that you have the same hair color as
yourself so obviously holds.


• Symmetric: If the first person, Bob (say), has the same hair color as a
second person, Betty (say), then Betty has the same hair color as Bob, so
this holds too.


• Transitive: If Bob has the same hair color as Betty (say) and Betty has
the same color as Brenda (say), then it follows that Bob and Brenda have
the same hair color, so the transitive property holds too and we are
done.


Solution set in set notation


Here is an augmented matrix, let's think about what the solution set looks
like

$$\left(\begin{array}{ccc|c} 1 & 0 & 3 & 2 \\ 0 & 1 & 0 & 1 \end{array}\right).$$

This looks like the system

$$1 \cdot x_1 + 3 x_3 = 2$$
$$1 \cdot x_2 = 1.$$

Notice that when the system is written this way the copy of the 2 × 2 identity
matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ makes it easy to write a solution in terms of the variables
$x_1$ and $x_2$. We will call $x_1$ and $x_2$ the pivot variables. The third column
$\begin{pmatrix} 3 \\ 0 \end{pmatrix}$ does not look like part of an identity matrix, and there is no 3 × 3 identity
in the augmented matrix. Notice there are more variables than equations and
that this means we will have to write the solutions for the system in terms of
the variable $x_3$. We'll call $x_3$ the free variable.

Let $x_3 = \mu$. (We could also just add a "dummy" equation $x_3 = x_3$.) Then we
can rewrite the first equation in our system

$$x_1 + 3x_3 = 2$$
$$x_1 + 3\mu = 2$$
$$x_1 = 2 - 3\mu.$$

Then since the second equation doesn't depend on μ we can keep the equation

$$x_2 = 1,$$

and for a third equation we can write

$$x_3 = \mu$$

so that we get the system

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 - 3\mu \\ 1 \\ \mu \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} + \begin{pmatrix} -3\mu \\ 0 \\ \mu \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} -3 \\ 0 \\ 1 \end{pmatrix}.$$

Any value of μ will give a solution of the system, and any solution can be written
in this form for some value of μ. Since there are multiple solutions, we can
also express them as a set:

$$\left\{ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} -3 \\ 0 \\ 1 \end{pmatrix} : \mu \in \mathbb{R} \right\}.$$

Worked Examples of Gaussian Elimination 


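Let us consider that we are given two systems of equations that give rise to
the following two (augmented) matrices:

$$\left(\begin{array}{cccc|c} 2 & 5 & 2 & 0 & 2 \\ 1 & 1 & 1 & 0 & 1 \\ 1 & 4 & 1 & 0 & 1 \end{array}\right) \qquad \left(\begin{array}{cc|c} 5 & 2 & 9 \\ 0 & 5 & 10 \\ 0 & 3 & 6 \end{array}\right)$$

and we want to find the solution to those systems. We will do so by doing
Gaussian elimination.
For the first matrix we have

$$\left(\begin{array}{cccc|c} 2 & 5 & 2 & 0 & 2 \\ 1 & 1 & 1 & 0 & 1 \\ 1 & 4 & 1 & 0 & 1 \end{array}\right) \sim \left(\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 1 \\ 2 & 5 & 2 & 0 & 2 \\ 1 & 4 & 1 & 0 & 1 \end{array}\right) \sim \left(\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 1 \\ 0 & 3 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 \end{array}\right)$$
$$\sim \left(\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 \end{array}\right) \sim \left(\begin{array}{cccc|c} 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right).$$


1. We begin by interchanging the first two rows in order to get a 1 in the
upper-left hand corner and avoid dealing with fractions.


2. Next we subtract row 1 from row 3 and twice from row 2 to get zeros in the
left-most column.


3. Then we scale row 2 to have a 1 in the eventual pivot.


4. Finally we subtract row 2 from row 1 and three times from row 3 to get it
into Reduced Row Echelon Form.


Therefore we can write x = 1 − λ, y = 0, z = λ and w = μ, or in vector form

$$\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} -1 \\ 0 \\ 1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$

Now for the second system we have

$$\left(\begin{array}{cc|c} 5 & 2 & 9 \\ 0 & 5 & 10 \\ 0 & 3 & 6 \end{array}\right) \sim \left(\begin{array}{cc|c} 5 & 2 & 9 \\ 0 & 1 & 2 \\ 0 & 3 & 6 \end{array}\right) \sim \left(\begin{array}{cc|c} 5 & 2 & 9 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} 5 & 0 & 5 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{array}\right).$$

We scale the second and third rows appropriately in order to avoid fractions,
then subtract the corresponding rows as before. Finally scale the first row
and hence we have x = 1 and y = 2 as a unique solution.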
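
The three row operations used above (swap, scale, subtract a multiple) are enough to write a small reduced row echelon form routine. The Python sketch below is one straightforward way to do it with exact fractions; it always takes the first available pivot, so its intermediate steps differ from the hand computation even though the final RREF is the same.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form using the three elementary row operations."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivot_row = 0
    for col in range(len(rows[0])):
        # Find a row at or below pivot_row with a non-zero entry in this column.
        candidates = [i for i in range(pivot_row, len(rows)) if rows[i][col] != 0]
        if not candidates:
            continue
        # Swap the candidate row into the pivot position.
        rows[pivot_row], rows[candidates[0]] = rows[candidates[0]], rows[pivot_row]
        # Scale the pivot row so that the pivot entry becomes 1.
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
        # Subtract multiples of the pivot row to clear the rest of the column.
        for i in range(len(rows)):
            if i != pivot_row:
                factor = rows[i][col]
                rows[i] = [x - factor * p for x, p in zip(rows[i], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == len(rows):
            break
    return rows

first = rref([[2, 5, 2, 0, 2], [1, 1, 1, 0, 1], [1, 4, 1, 0, 1]])
second = rref([[5, 2, 9], [0, 5, 10], [0, 3, 6]])
print(first)
print(second)
```

Since the reduced row echelon form of a matrix is unique, the choice of pivots affects only the route, not the destination.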


Hint for Review Question 5


The first part for Review Question 5 is simple: just write out the associated
linear system and you will find the equation 0 = 6, which is inconsistent.
Therefore we learn that we must avoid a row of zeros preceding a non-vanishing
entry after the vertical bar.

Turning to the system of equations, we first write out the augmented matrix
and then perform two row operations

\[
\left(\begin{array}{ccc|c} 1 & -3 & 0 & 6 \\ 1 & 0 & 3 & -3 \\ 2 & k & 3-k & 1 \end{array}\right)
\stackrel{R_2 - R_1;\ R_3 - 2R_1}{\sim}
\left(\begin{array}{ccc|c} 1 & -3 & 0 & 6 \\ 0 & 3 & 3 & -9 \\ 0 & k+6 & 3-k & -11 \end{array}\right).
\]

Next we would like to subtract some amount of $R_2$ from $R_3$ to achieve a zero in
the third entry of the second column. But if

\[
k + 6 = 3 - k \Rightarrow k = -\frac{3}{2},
\]


this would produce zeros in the third row before the vertical line. You should 
also check that this does not make the whole third line zero. You now have 
enough information to write a complete solution. 
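
The check suggested above can be sketched in a few lines of Python using exact
fractions. This is only an illustration: the function name
`third_row_after_elimination` is our own, and the two rows hard-coded inside it
are the second and third rows obtained after the row operations $R_2 - R_1$ and
$R_3 - 2R_1$.

```python
from fractions import Fraction

def third_row_after_elimination(k):
    """Subtract the multiple of row 2 from row 3 that clears the second column."""
    k = Fraction(k)
    row2 = [Fraction(0), Fraction(3), Fraction(3), Fraction(-9)]
    row3 = [Fraction(0), k + 6, 3 - k, Fraction(-11)]
    factor = row3[1] / row2[1]  # how much of row 2 to subtract
    return [r3 - factor * r2 for r2, r3 in zip(row2, row3)]

# The special value: everything left of the bar vanishes, the right side does not.
special = third_row_after_elimination(Fraction(-3, 2))
# A generic value keeps a non-zero entry in the third column.
generic = third_row_after_elimination(0)
print(special, generic)
```

Comparing the two results shows why one value of $k$ has to be treated
separately from all the others.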


Planes 


Here we want to describe the mathematics of planes in space. The video is 
summarised by the following picture: 


[Figure: the plane $N \cdot V = d$ with its normal vector $N$, a particular
solution $V_0$ and a homogeneous solution.]


A plane is often called $\mathbb{R}^2$ because it is spanned by two coordinates, and space
is called $\mathbb{R}^3$ and has three coordinates, usually called $(x, y, z)$. The equation
for a plane is

\[ ax + by + cz = d. \]

Let's simplify this by calling $V = (x, y, z)$ the vector of unknowns and
$N = (a, b, c)$. Using the dot product in $\mathbb{R}^3$ we have

\[ N \cdot V = d. \]

Remember that when vectors are perpendicular their dot products vanish, i.e.
$U \cdot V = 0 \Leftrightarrow U \perp V$. This means that if a vector $V_0$ solves our equation
$N \cdot V = d$, then so too does $V_0 + C$ whenever $C$ is perpendicular to $N$. This is because

\[ N \cdot (V_0 + C) = N \cdot V_0 + N \cdot C = d + 0 = d. \]


But C is ANY vector perpendicular to N, so all the possibilities for C span 
a plane whose normal vector is N. Hence we have shown that solutions to the 
equation ax + by + cz = 0 are a plane with normal vector N = (a,b,c). 
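
As a numerical illustration of this argument, the sketch below picks a normal
vector, a particular solution and a perpendicular direction, and confirms that
shifting along the perpendicular direction stays on the plane. The specific
values of `N`, `d` and `V0`, and the helper names `dot` and `cross`, are
assumptions chosen for the example.

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product in three dimensions; the result is perpendicular to u and v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

N = (1, 2, 5)          # example normal vector
d = 3                  # example right-hand side
V0 = (3, 0, 0)         # one particular solution of N.V = d
C = cross(N, (0, 0, 1))  # some vector perpendicular to N

# Move away from V0 by a multiple of C and test the plane equation again.
shifted = tuple(v + 7 * c for v, c in zip(V0, C))
print(dot(N, V0), dot(N, C), dot(N, shifted))
```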


Pictures and Explanation 


This video considers solution sets for linear systems with three unknowns.
These are often called $(x, y, z)$ and label points in $\mathbb{R}^3$. Let's work case by case:


• If you have no equations at all, then any $(x, y, z)$ is a solution, so the
solution set is all of $\mathbb{R}^3$. The picture looks a little silly:

[Figure: all of $\mathbb{R}^3$.]


• For a single equation, the solution is a plane. This is explained in
this video or the accompanying script. The picture looks like this:

[Figure: a single plane.]


• For two equations, we must look at two planes. These usually intersect
along a line, so the solution set will also (usually) be a line:

[Figure: two planes intersecting along a line.]


• For three equations, most often their intersection will be a single
point so the solution will then be unique:

[Figure: three planes intersecting in a single point.]


• Of course stuff can go wrong. Two different looking equations could
determine the same plane, or worse, equations could be inconsistent. If
the equations are inconsistent, there will be no solutions at all. For
example, if you had four equations determining four parallel planes the
solution set would be empty. This looks like this:

[Figure: four parallel planes.]


G.3 Vectors in Space, n-Vectors


Review of Parametric Notation 


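The equation for a plane in three variables $x$, $y$ and $z$ looks like

\[ ax + by + cz = d \]

where $a$, $b$, $c$, and $d$ are constants. Let's look at the example

\[ x + 2y + 5z = 3. \]

In fact this is a system of linear equations whose solutions form a plane with
normal vector $(1, 2, 5)$. As an augmented matrix the system is simply

\[ \left(\begin{array}{ccc|c} 1 & 2 & 5 & 3 \end{array}\right). \]

This is actually RREF! So we can let $x$ be our pivot variable and $y$, $z$ be
represented by free parameters $\lambda_1$ and $\lambda_2$:

\[ y = \lambda_1, \qquad z = \lambda_2. \]

Thus we write the solution as

\[
\begin{array}{rcl}
x &=& -2\lambda_1 - 5\lambda_2 + 3 \\
y &=& \lambda_1 \\
z &=& \lambda_2
\end{array}
\]

or in vector notation

\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix}
+ \lambda_1 \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} -5 \\ 0 \\ 1 \end{pmatrix} .
\]

This is a parametric equation for the plane. Planes are "two-dimensional"
because they are described by two free variables. Here's a picture of the
resulting plane:

[Figure: the resulting plane.]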
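
The parametric description of the plane $x + 2y + 5z = 3$ can also be tested
numerically. In the sketch below, `plane_point` is a hypothetical helper that
takes the two free parameters (one for $y$, one for $z$) and returns the
corresponding point; every point produced this way should satisfy the original
equation.

```python
def plane_point(lam1, lam2):
    """Point on the plane x + 2y + 5z = 3 with free parameters y = lam1, z = lam2."""
    x = 3 - 2 * lam1 - 5 * lam2  # solve the equation for the pivot variable x
    return (x, lam1, lam2)

def on_plane(point):
    x, y, z = point
    return x + 2 * y + 5 * z == 3

# Sample a small grid of parameter values.
samples = [plane_point(l1, l2) for l1 in range(-2, 3) for l2 in range(-2, 3)]
print(all(on_plane(p) for p in samples))
```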


The Story of Your Life 


This video talks about the weird notion of a "length-squared" for a vector
$v = (x, t)$ given by $||v||^2 = x^2 - t^2$ used in Einstein's theory of relativity. The
idea is to plot the story of your life on a plane with coordinates $(x, t)$. The
coordinate $x$ encodes where an event happened (for real life situations, we
must replace $x \to (x, y, z) \in \mathbb{R}^3$). The coordinate $t$ says when events happened.
Therefore you can plot your life history as a worldline as shown:

[Figure: a worldline in the $(x, t)$ plane.]

Each point on the worldline corresponds to a place and time of an event in your
life. The slope of the worldline has to do with your speed. Or to be precise,
the inverse slope is your velocity. Einstein realized that the maximum speed
possible was that of light, often called $c$. In the diagram above $c = 1$ and
corresponds to the lines $x = \pm t \Rightarrow x^2 - t^2 = 0$. This should get you started in
your search for vectors with zero length.


G.4 Vector Spaces 


Examples of Each Rule 


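Let's show that $\mathbb{R}^2$ is a vector space. To do this (unless we invent some clever
tricks) we will have to check all parts of the definition. It's worth doing
this once, so here we go:

Before we start, remember that for $\mathbb{R}^2$ we define vector addition and scalar
multiplication component-wise.

(+i) Additive closure: We need to make sure that when we add
$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$
we do not get something outside the original vector space $\mathbb{R}^2$. This
just relies on the underlying structure of real numbers whose sums are
again real numbers so, using our component-wise addition law, we have

\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
:= \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix} \in \mathbb{R}^2 .
\]

(+ii) Additive commutativity: We want to check that when we add any two vectors
we can do so in either order, i.e.

\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} + \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} .
\]

This again relies on the underlying real numbers which for any $x, y \in \mathbb{R}$
obey

\[ x + y = y + x. \]

This fact underlies the middle step of the following computation

\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix}
= \begin{pmatrix} y_1 + x_1 \\ y_2 + x_2 \end{pmatrix}
= \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} + \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},
\]

which demonstrates what we wished to show.

(+iii) Additive Associativity: This shows that we needn't specify with
parentheses which order we intend to add triples of vectors because their
sums will agree for either choice. What we have to check is

\[
\left( \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \right)
+ \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
= \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
+ \left( \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} + \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \right) .
\]

Again this relies on the underlying associativity of real numbers:

\[ (x + y) + z = x + (y + z). \]

The computation required is

\[
\left( \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \right)
+ \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
= \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix} + \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
= \begin{pmatrix} (x_1 + y_1) + z_1 \\ (x_2 + y_2) + z_2 \end{pmatrix}
\]
\[
= \begin{pmatrix} x_1 + (y_1 + z_1) \\ x_2 + (y_2 + z_2) \end{pmatrix}
= \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 + z_1 \\ y_2 + z_2 \end{pmatrix}
= \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
+ \left( \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} + \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \right) .
\]

(+iv) Zero: There needs to exist a vector $\vec{0}$ that works the way we would expect
zero to behave, i.e.

\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \vec{0} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} .
\]

It is easy to find; the answer is

\[ \vec{0} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} . \]

You can easily check that when this vector is added to any vector, the
result is unchanged.

(+v) Additive Inverse: We need to check that when we have
$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ there is
another vector that can be added to it so the sum is $\vec{0}$. (Note that it
is important to first figure out what $\vec{0}$ is here!) The answer for the
additive inverse of $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ is
$\begin{pmatrix} -x_1 \\ -x_2 \end{pmatrix}$ because

\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} -x_1 \\ -x_2 \end{pmatrix}
= \begin{pmatrix} x_1 - x_1 \\ x_2 - x_2 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \vec{0} .
\]

We are half-way done; now we need to consider the rules for scalar
multiplication. Notice that we multiply vectors by scalars (i.e. numbers) but do NOT
multiply vectors by vectors.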


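(·i) Multiplicative closure: Again, we are checking that an operation does
not produce vectors outside the vector space. For a scalar $a \in \mathbb{R}$, we
require that $a \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ lies in $\mathbb{R}^2$. First we compute
using our component-wise rule for scalars times vectors:

\[
a \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a x_1 \\ a x_2 \end{pmatrix} .
\]

Since products of real numbers $a x_1$ and $a x_2$ are again real numbers we see
this is indeed inside $\mathbb{R}^2$.

(·ii) Multiplicative distributivity: The equation we need to check is

\[
(a + b) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= a \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + b \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} .
\]

Once again this is a simple LHS=RHS proof using properties of the real
numbers. Starting on the left we have

\[
(a + b) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} (a + b) x_1 \\ (a + b) x_2 \end{pmatrix}
= \begin{pmatrix} a x_1 + b x_1 \\ a x_2 + b x_2 \end{pmatrix}
= \begin{pmatrix} a x_1 \\ a x_2 \end{pmatrix} + \begin{pmatrix} b x_1 \\ b x_2 \end{pmatrix}
= a \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + b \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},
\]

as required.

(·iii) Additive distributivity: This time the equation we need to check is

\[
a \left( \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \right)
= a \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + a \begin{pmatrix} y_1 \\ y_2 \end{pmatrix},
\]

i.e., one scalar but two different vectors. The method is by now becoming
familiar

\[
a \left( \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \right)
= a \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix}
= \begin{pmatrix} a(x_1 + y_1) \\ a(x_2 + y_2) \end{pmatrix}
\]
\[
= \begin{pmatrix} a x_1 + a y_1 \\ a x_2 + a y_2 \end{pmatrix}
= \begin{pmatrix} a x_1 \\ a x_2 \end{pmatrix} + \begin{pmatrix} a y_1 \\ a y_2 \end{pmatrix}
= a \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + a \begin{pmatrix} y_1 \\ y_2 \end{pmatrix},
\]

again as required.

(·iv) Multiplicative associativity: Just as for addition, this is the
requirement that the order of bracketing does not matter. We need to
establish whether

\[
(a.b) \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= a \cdot \left( b \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right) .
\]

This clearly holds for real numbers $a.(b.x) = (a.b).x$. The computation is

\[
(a.b) \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} (a.b).x_1 \\ (a.b).x_2 \end{pmatrix}
= \begin{pmatrix} a.(b.x_1) \\ a.(b.x_2) \end{pmatrix}
= a \cdot \begin{pmatrix} b.x_1 \\ b.x_2 \end{pmatrix}
= a \cdot \left( b \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right),
\]

which is what we want.

(·v) Unity: We need to find a special scalar that acts the way we would expect
"1" to behave, i.e.

\[
\text{"1"} \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} .
\]

There is an obvious choice for this special scalar: just the real number
1 itself. Indeed, to be pedantic let's calculate

\[
1 \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 1.x_1 \\ 1.x_2 \end{pmatrix}
= \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} .
\]

Now we are done: we have really proven that $\mathbb{R}^2$ is a vector space, so let's write
a little square to celebrate.

\[ \square \]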


Example of a Vector Space 


This video talks about the definition of a vector space. Even though the
definition looks long, complicated and abstract, it is actually designed to
model a very wide range of real life situations. As an example, consider the
vector space

V = {all possible ways to hit a hockey puck}.


The different ways of hitting a hockey puck can all be considered as vectors.
You can think about adding vectors by having two players hitting the puck at
the same time. This picture shows vectors N and J corresponding to the ways
Nicole Darwitz and Jenny Potter hit a hockey puck, plus the vector obtained
when they hit the puck together.

[Figure: the vectors N, J and their sum N + J.]

You can also model the new vector 2J obtained by scalar multiplication by
2 by thinking about Jenny hitting the puck twice (or a world with two Jenny
Potters...). Now ask yourself questions like whether the multiplicative
distributive law

\[ 2J + 2N = 2(J + N) \]

makes sense in this context.


Hint for Review Question 5 


Let's worry about the last part of the problem. The problem can be solved
by considering a non-zero simple polynomial, such as a degree 0 polynomial,
and multiplying by $i \in \mathbb{C}$. That is to say, we take a vector $p \in P_3^{\mathbb{R}}$ and then
consider $i \cdot p$. This will violate one of the vector space rules about scalars,
and you should take from this that the scalar field matters.

As a second hint, consider $\mathbb{Q}$ (the field of rational numbers). This is not
a vector space over $\mathbb{R}$ since $\sqrt{2} \cdot 1 = \sqrt{2} \notin \mathbb{Q}$, so it is not closed under scalar
multiplication, but it is clearly a vector space over $\mathbb{Q}$.


G.5 Linear Transformations 


Hint for Review Question 5 


The first thing we see in the problem is a definition of this new space $P_n$.
Elements of $P_n$ are polynomials that look like

\[ a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n \]

where the $a_i$'s are constants. So this means if $L$ is a linear transformation
from $P_2 \to P_3$ that the inputs of $L$ are degree two polynomials which look like

\[ a_0 + a_1 t + a_2 t^2 \]

and the output will have degree three and look like

\[ b_0 + b_1 t + b_2 t^2 + b_3 t^3 . \]

We also know that $L$ is a linear transformation, so what does that mean in
this case? Well, by linearity we know that we can separate out the sum, and
pull out the constants so we get

\[ L(a_0 + a_1 t + a_2 t^2) = a_0 L(1) + a_1 L(t) + a_2 L(t^2) . \]

Just this should be really helpful for the first two parts of the problem. The
third part of the problem is asking us to think about this as a linear algebra
problem, so let's think about how we could write this in the vector notation we
use in the class. We could write

\[ a_0 + a_1 t + a_2 t^2 \quad \text{as} \quad \begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix} . \]

And think for a second about how you add polynomials: you match up terms of
the same degree and add the constants component-wise. So it makes some sense
to think about polynomials this way, since vector addition is also
component-wise.

We could also write the output

\[ b_0 + b_1 t + b_2 t^2 + b_3 t^3 \quad \text{as} \quad \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix} . \]

Then let's look at the information given in the problem and think about it
in terms of column vectors

• $L(1) = 4$, but we can think of the input $1 = 1 + 0t + 0t^2$ and the output
$4 = 4 + 0t + 0t^2 + 0t^3$ and write this as
\[ L \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 4 \\ 0 \\ 0 \\ 0 \end{pmatrix} . \]

• $L(t) = t^3$. This can be written as
\[ L \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} . \]

• $L(t^2) = t - 1$. It might be a little trickier to figure out how to write
$t - 1$ but if we write the polynomial out with the terms in order and with
zeroes next to the terms that do not appear, we can see that
\[ t - 1 = -1 + t + 0t^2 + 0t^3 \quad \text{corresponds to} \quad \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix} . \]
So this can be written as
\[ L \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix} . \]

Now to think about how you would write the linear transformation $L$ as
a matrix, first think about what the dimensions of the matrix would be.
Then look at the first two parts of this problem to help you figure out
what the entries should be.


G.6 Matrices


Adjacency Matrix Example 


Let's think about a graph as a mini-facebook. In this tiny facebook there are
only four people, Alice, Bob, Carl, and David.
Suppose we have the following relationships


• Alice and Bob are friends.
• Alice and Carl are friends.
• Carl and Bob are friends.
• David and Bob are friends.


Now draw a picture where each person is a dot, and then draw a line between
the dots of people who are friends. This is an example of a graph if you think
of the people as nodes, and the friendships as edges.

[Figure: a graph with nodes Alice, Bob, Carl and David and one edge for each
friendship.]

Now let's make a $4 \times 4$ matrix, which is an adjacency matrix for the graph.
Make a column and a row for each of the four people. It will look a lot like a
table. When two people are friends put a 1 in the row of one and the column
of the other. For example Alice and Carl are friends so we can label the table
below.

        A   B   C   D
    A           1
    B
    C   1
    D

We can continue to label the entries for each friendship. Here let's assume
that people are friends with themselves, so the diagonal will be all ones.

        A   B   C   D
    A   1   1   1   0
    B   1   1   1   1
    C   1   1   1   0
    D   0   1   0   1

Then take the entries of this table as a matrix

\[
\begin{pmatrix}
1 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 \\
0 & 1 & 0 & 1
\end{pmatrix}
\]


Notice that this table is symmetric across the diagonal, the same way a 
multiplication table would be symmetric. This is because on facebook friend- 
ship is symmetric in the sense that you can’t be friends with someone if they 
aren’t friends with you too. This is an example of a symmetric matrix. 

You could think about what you would have to do differently to draw a graph 
for something like twitter where you don’t have to follow everyone who follows 
you. The adjacency matrix might not be symmetric then. 
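
The construction can be sketched in code. The function names
`adjacency_matrix` and `is_symmetric` below are our own, and the sketch follows
the convention used above that everyone counts as a friend of themselves.

```python
people = ["Alice", "Bob", "Carl", "David"]
friendships = [("Alice", "Bob"), ("Alice", "Carl"),
               ("Carl", "Bob"), ("David", "Bob")]

def adjacency_matrix(people, friendships):
    """Build the adjacency matrix of an undirected friendship graph."""
    index = {name: i for i, name in enumerate(people)}
    n = len(people)
    # Start with ones on the diagonal: people are friends with themselves.
    matrix = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for p, q in friendships:
        # Friendship is mutual, so fill both (row p, column q) and (row q, column p).
        matrix[index[p]][index[q]] = 1
        matrix[index[q]][index[p]] = 1
    return matrix

def is_symmetric(matrix):
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))

A = adjacency_matrix(people, friendships)
for row in A:
    print(row)
print(is_symmetric(A))
```

For a twitter-like graph one would fill only one of the two entries per
"follows" relation, and `is_symmetric` could then return `False`.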


Do Matrices Commute? 


This video shows you a funny property of matrices. Some matrix properties 
look just like those for numbers. For example numbers obey 


a(bc) = (ab)c 


and so do matrices: 


A(BC) = (AB)C. 


This says the order of bracketing does not matter and is called associativity. 
Now we ask ourselves whether the basic property of numbers 


ab = ba, 


holds for matrices 


AB= BA. 


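For this, first note that we need to work with square matrices for both
orderings to even make sense. Let's take a simple $2 \times 2$ example, let

\[
A = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}, \qquad
C = \begin{pmatrix} 1 & 0 \\ a & 1 \end{pmatrix} .
\]

In fact, computing $AB$ and $BA$ we get the same result

\[
AB = BA = \begin{pmatrix} 1 & a + b \\ 0 & 1 \end{pmatrix},
\]

so this pair of matrices do commute. Let's try $A$ and $C$:

\[
AC = \begin{pmatrix} 1 + a^2 & a \\ a & 1 \end{pmatrix}
\quad \text{and} \quad
CA = \begin{pmatrix} 1 & a \\ a & 1 + a^2 \end{pmatrix},
\]

so (for $a \neq 0$)

\[ AC \neq CA \]

and this pair of matrices does not commute. Generally, matrices usually do not
commute, and the problem of finding those that do is a very interesting one.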
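
These products are easy to confirm by direct computation. The sketch below
assumes the three matrices are $A = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}$,
$B = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$ and
$C = \begin{pmatrix} 1 & 0 \\ a & 1 \end{pmatrix}$, and plugs in the sample values
$a = 2$, $b = 3$ (our own choice); `matmul` is a small helper written for the
example.

```python
def matmul(P, Q):
    """Multiply matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

a, b = 2, 3  # sample values; any non-zero a shows the same behaviour
A = [[1, a], [0, 1]]
B = [[1, b], [0, 1]]
C = [[1, 0], [a, 1]]

print(matmul(A, B), matmul(B, A))  # the same matrix twice
print(matmul(A, C), matmul(C, A))  # two different matrices
```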


Matrix Exponential Example 


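This video shows you how to compute

\[
\exp \begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix} .
\]

For this we need to remember that the matrix exponential is defined by its
power series

\[
\exp M := I + M + \frac{1}{2!} M^2 + \frac{1}{3!} M^3 + \cdots .
\]

Now let's call

\[
\begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix} = i\theta
\]

where the matrix

\[
i := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
\]

and by matrix multiplication is seen to obey

\[
i^2 = -I, \qquad i^3 = -i, \qquad i^4 = I .
\]

Using these facts we compute by organizing terms according to whether they
have an $i$ or not:

\[
\exp i\theta
= I \left( 1 - \frac{1}{2!}\theta^2 + \frac{1}{4!}\theta^4 - \cdots \right)
+ i \left( \theta - \frac{1}{3!}\theta^3 + \frac{1}{5!}\theta^5 - \cdots \right)
\]
\[
= I \cos\theta + i \sin\theta
= \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} .
\]

Here we used the familiar Taylor series for the cosine and sine functions. A
fun thing to think about is how the above matrix acts on vectors in the plane.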
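
The closed form can be compared against a truncated power series. The sketch
below is only a numerical sanity check, not a recommended way to compute matrix
exponentials: `mat_exp` is a hypothetical helper that sums the first 30 terms of
the series, and the angle 0.7 is an arbitrary sample value.

```python
import math

def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(M, terms=30):
    """Approximate exp(M) by the first `terms` terms of its power series."""
    n = len(M)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]  # holds M^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, M)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

theta = 0.7
approx = mat_exp([[0.0, theta], [-theta, 0.0]])
exact = [[math.cos(theta), math.sin(theta)],
         [-math.sin(theta), math.cos(theta)]]
print(approx)
print(exact)
```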
Proof Explanation


In this video we will talk through the steps required to prove

\[ \mathrm{tr}\, MN = \mathrm{tr}\, NM . \]

There are some useful things to remember. First we can write

\[ M = (m^i_j) \quad \text{and} \quad N = (n^i_j) \]

where the upper index labels rows and the lower one columns. Then

\[ MN = \Big( \sum_l m^i_l n^l_j \Big), \]

where the "open" indices $i$ and $j$ label rows and columns, but the index $l$ is
a "dummy" index because it is summed over. (We could have given it any name
we liked!)

Finally the trace is the sum over diagonal entries for which the row and
column numbers must coincide

\[ \mathrm{tr}\, M = \sum_i m^i_i . \]

Hence starting from the left of the statement we want to prove, we have

\[ \mathrm{LHS} = \mathrm{tr}\, MN = \sum_i \sum_l m^i_l n^l_i . \]

Next we do something obvious, just change the order of the entries $m^i_l$ and $n^l_i$
(they are just numbers) so

\[ \sum_i \sum_l m^i_l n^l_i = \sum_i \sum_l n^l_i m^i_l . \]

Equally obvious, we now rename $i \to l$ and $l \to i$ so

\[ \sum_i \sum_l n^l_i m^i_l = \sum_l \sum_i n^i_l m^l_i . \]

Finally, since we have finite sums it is legal to change the order of
summations

\[ \sum_l \sum_i n^i_l m^l_i = \sum_i \sum_l n^i_l m^l_i . \]

This expression is the same as the one on the line above where we started
except the $m$ and $n$ have been swapped so

\[ \sum_i \sum_l m^i_l n^l_i = \mathrm{tr}\, NM = \mathrm{RHS} . \]

This completes the proof.
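
The double sum in the proof translates directly into code. In the sketch below
`trace_of_product` is a hypothetical helper that computes the trace of a product
of two square matrices without forming the product, and the two matrices are
arbitrary sample values.

```python
def trace_of_product(M, N):
    """Trace of MN as the double sum over i and l of M[i][l] * N[l][i]."""
    n = len(M)
    return sum(M[i][l] * N[l][i] for i in range(n) for l in range(n))

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

M = [[1, 2], [3, 4]]
N = [[0, 5], [6, 7]]

# The products themselves differ, but their traces agree.
print(matmul(M, N), matmul(N, M))
print(trace_of_product(M, N), trace_of_product(N, M))
```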
Hint for Review Question 4


This problem just amounts to remembering that the dot product of
$x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ is

\[ x_1 y_1 + x_2 y_2 + \cdots + x_n y_n . \]

Then try multiplying the above row vector times $y^T$ and compare.


Hint for Review Question 5 


The majority of the problem comes down to showing that matrices are right
distributive. Let $M_k$ be the set of all $n \times k$ matrices for any $n$, and define the map
$f_R : M_k \to M_m$ by $f_R(M) = MR$ where $R$ is some $k \times m$ matrix. It should be
clear that $f_R(a \cdot M) = (aM)R = a(MR) = a f_R(M)$ for any scalar $a$. Now all
that needs to be proved is that

\[ f_R(M + N) = (M + N)R = MR + NR = f_R(M) + f_R(N), \]

and you can show this by looking at each entry.

We can actually generalize the concept of this problem. Let $V$ be some
vector space and $\mathcal{M}$ be some collection of matrices, and we say that $\mathcal{M}$ is a
left-action on $V$ if

\[ (M \cdot N) \circ v = M \circ (N \circ v) \]

for all $M, N \in \mathcal{M}$ and $v \in V$, where $\cdot$ denotes multiplication in $\mathcal{M}$ (i.e. standard
matrix multiplication) and $\circ$ denotes the matrix acting as a linear map on a vector
(i.e. $M(v)$). There is a corresponding notion of a right action where

\[ v \circ (M \cdot N) = (v \circ M) \circ N \]

where we treat $v \circ M$ as $M(v)$ as before, and note the order in which the
matrices are applied. People will often omit the left or right because they
are essentially the same, and just say that $\mathcal{M}$ acts on $V$.


Hint for Review Question 8 


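This is a hint for computing exponents of matrices. So what is $e^A$ if $A$ is a
matrix? We remember the Taylor series

\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} . \]

So as matrices we can think about

\[ e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!} . \]

This means we need to have an idea of what $A^n$ looks like for any $n$. Let's
look at the example of one of the matrices in the problem. Let

\[ A = \begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix} . \]

Let's compute $A^n$ for the first few $n$.

\[
A^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
A^1 = \begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix}, \quad
A^2 = \begin{pmatrix} 1 & 2\lambda \\ 0 & 1 \end{pmatrix}, \quad
A^3 = \begin{pmatrix} 1 & 3\lambda \\ 0 & 1 \end{pmatrix} .
\]

There is a pattern here which is that

\[ A^n = \begin{pmatrix} 1 & n\lambda \\ 0 & 1 \end{pmatrix}, \]

then we can think about the first few terms of the sequence

\[ e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!} = A^0 + A + \frac{1}{2!} A^2 + \frac{1}{3!} A^3 + \cdots . \]

Looking at the entries when we add this we get that the upper left-most entry
looks like this:

\[ 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots . \]

Continue this process with each of the entries using what you know about Taylor
series expansions to find the sum of each entry.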


2 x 2 Example 


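Let's go through and show how this $2 \times 2$ example satisfies all of these properties.

Let's look at

\[ M = \begin{pmatrix} 7 & 3 \\ 11 & 5 \end{pmatrix} . \]

We have a rule to compute the inverse

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1}
= \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} .
\]

So this means that

\[
M^{-1} = \frac{1}{35 - 33} \begin{pmatrix} 5 & -3 \\ -11 & 7 \end{pmatrix} .
\]

Let's check that $M^{-1}M = I = MM^{-1}$.

\[
M^{-1}M = \frac{1}{35 - 33} \begin{pmatrix} 5 & -3 \\ -11 & 7 \end{pmatrix}
\begin{pmatrix} 7 & 3 \\ 11 & 5 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = I .
\]

You can compute $MM^{-1}$; this should work the other way too.
Now let's think about products of matrices.

\[
\text{Let } A = \begin{pmatrix} 1 & 3 \\ 1 & 5 \end{pmatrix}
\text{ and } B = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} .
\]

Notice that $M = AB$. We have a rule which says that $(AB)^{-1} = B^{-1}A^{-1}$.
Let's check to see if this works:

\[
A^{-1} = \frac{1}{2} \begin{pmatrix} 5 & -3 \\ -1 & 1 \end{pmatrix}
\quad \text{and} \quad
B^{-1} = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}
\]

and

\[
B^{-1}A^{-1} = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}
\frac{1}{2} \begin{pmatrix} 5 & -3 \\ -1 & 1 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 5 & -3 \\ -11 & 7 \end{pmatrix} = M^{-1} .
\]
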
Hint for Review Problem 3 


First note that (b) implies (a) is the easy direction: just think about what it
means for $M$ to be non-singular and for a linear function to be well-defined.
Therefore we assume that $M$ is singular, which implies that there exists a
non-zero vector $X_0$ such that $MX_0 = 0$. Now assume there exists some vector $X_V$
such that $MX_V = V$, and look at what happens to $X_V + c \cdot X_0$ for any $c$ in your
field. Lastly don't forget to address what happens if $X_V$ does not exist.


Hint for Review Question 4 


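In the text, only inverses for square matrices were discussed, but there is a
notion of left and right inverses for matrices that are not square. It helps
to look at an example with bits to see why. To start with we look at vector
spaces

\[
\mathbb{Z}_2^3 = \{ (x, y, z) \,|\, x, y, z = 0, 1 \}
\quad \text{and} \quad
\mathbb{Z}_2^2 = \{ (x, y) \,|\, x, y = 0, 1 \} .
\]

These have 8 and 4 vectors, respectively, that can be depicted as corners of
a cube or square:

[Figure: the vectors of $\mathbb{Z}_2^3$ and $\mathbb{Z}_2^2$ drawn as the corners of a cube and
of a square.]

Now let's consider a linear transformation

\[ L : \mathbb{Z}_2^3 \longrightarrow \mathbb{Z}_2^2 . \]

This must be represented by a matrix, and let's take the example

\[
L \begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} := AX .
\]

Since we have bits, we can work out what $L$ does to every vector; this is listed
below

\[
\begin{array}{ccc}
(0, 0, 0) & \stackrel{L}{\mapsto} & (0, 0) \\
(0, 0, 1) & \stackrel{L}{\mapsto} & (1, 0) \\
(1, 1, 0) & \stackrel{L}{\mapsto} & (1, 0) \\
(1, 0, 0) & \stackrel{L}{\mapsto} & (0, 1) \\
(0, 1, 1) & \stackrel{L}{\mapsto} & (0, 1) \\
(0, 1, 0) & \stackrel{L}{\mapsto} & (1, 1) \\
(1, 0, 1) & \stackrel{L}{\mapsto} & (1, 1) \\
(1, 1, 1) & \stackrel{L}{\mapsto} & (0, 0)
\end{array}
\]

Now let's think about left and right inverses. A left inverse $B$ to the matrix
$A$ would obey

\[ BA = I \]

and since the identity matrix is square, $B$ must be $3 \times 2$. It would have to
undo the action of $A$ and return vectors in $\mathbb{Z}_2^2$ to where they started from
in $\mathbb{Z}_2^3$. But above, we see that different vectors in $\mathbb{Z}_2^3$ are mapped to the
same vector in $\mathbb{Z}_2^2$ by the linear transformation $L$ with matrix $A$. So $B$
cannot exist. However a right inverse $C$ obeying

\[ AC = I \]

can. It would also be $3 \times 2$. Its job is to take a vector in $\mathbb{Z}_2^2$ back to one
in $\mathbb{Z}_2^3$ in a way that gets undone by the action of $A$. This can be done, but
not uniquely.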
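
Because everything here is made of bits, the claims about left and right
inverses can be checked by brute force. The sketch below is our own
illustration: it tabulates $L$ on all eight vectors and then searches over all
$3 \times 2$ bit matrices (the shape needed for both products $BA$ and $AC$ to be
defined and square), doing all arithmetic modulo 2. The helper names are
hypothetical.

```python
from itertools import product

A = [[0, 1, 1],
     [1, 1, 0]]

def matmul_mod2(P, Q):
    """Matrix product with entries reduced modulo 2."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) % 2
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def apply_mod2(M, v):
    """Apply the matrix M to the bit vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) % 2 for row in M)

# What L does to each of the 8 vectors in Z_2^3.
images = {v: apply_mod2(A, v) for v in product((0, 1), repeat=3)}

def all_3x2_bit_matrices():
    for bits in product((0, 1), repeat=6):
        yield [list(bits[0:2]), list(bits[2:4]), list(bits[4:6])]

I2 = [[1, 0], [0, 1]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

right_inverses = [C for C in all_3x2_bit_matrices() if matmul_mod2(A, C) == I2]
left_inverses = [B for B in all_3x2_bit_matrices() if matmul_mod2(B, A) == I3]

print(images)
print(len(left_inverses), len(right_inverses))
```

The search finds no left inverse and more than one right inverse, matching the
discussion above.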
Using an LU Decomposition


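Let's go through how to use an LU decomposition to speed up solving a system of
equations. Suppose you want to solve for $x$ in the equation $Mx = b$

\[
\begin{pmatrix} 1 & 0 & -5 \\ 3 & -1 & -14 \\ 1 & 0 & -3 \end{pmatrix} x
= \begin{pmatrix} 6 \\ 19 \\ 4 \end{pmatrix}
\]

where you are given the decomposition of $M$ into the product of $L$ and $U$ which
are lower and upper triangular matrices respectively.

\[
M = \begin{pmatrix} 1 & 0 & -5 \\ 3 & -1 & -14 \\ 1 & 0 & -3 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 1 & 0 & 2 \end{pmatrix}
\begin{pmatrix} 1 & 0 & -5 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = LU
\]

First you should solve $L(Ux) = b$ for $Ux$. The augmented matrix you would use
looks like this

\[
\left(\begin{array}{ccc|c} 1 & 0 & 0 & 6 \\ 3 & 1 & 0 & 19 \\ 1 & 0 & 2 & 4 \end{array}\right)
\]

This is an easy augmented matrix to solve because it is lower triangular. If
you were to write out the three equations using variables, you would find that
the first equation has already been solved, and is ready to be plugged into
the second equation. This forward substitution makes solving the system much
faster. Try it and in a few steps you should be able to get

\[
\left(\begin{array}{ccc|c} 1 & 0 & 0 & 6 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{array}\right)
\]

This tells us that $Ux = \begin{pmatrix} 6 \\ 1 \\ -1 \end{pmatrix}$. Now the second part of
the problem is to solve for $x$. The augmented matrix you get is

\[
\left(\begin{array}{ccc|c} 1 & 0 & -5 & 6 \\ 0 & -1 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{array}\right)
\]

It should take only a few steps to transform it into

\[
\left(\begin{array}{ccc|c} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & -1 \end{array}\right),
\]

which gives us the answer $x = \begin{pmatrix} 1 \\ -2 \\ -1 \end{pmatrix}$.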
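
The two triangular solves can be sketched as follows. This is a minimal
illustration using exact fractions; the function names `forward_substitution`
and `back_substitution` are our own, and the code assumes the diagonal entries
of both triangular matrices are non-zero.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve L w = b for lower triangular L, working from the first row down."""
    w = []
    for i, row in enumerate(L):
        known = sum(row[j] * w[j] for j in range(i))
        w.append((Fraction(b[i]) - known) / row[i])
    return w

def back_substitution(U, w):
    """Solve U x = w for upper triangular U, working from the last row up."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(w[i]) - known) / U[i][i]
    return x

L = [[1, 0, 0], [3, 1, 0], [1, 0, 2]]
U = [[1, 0, -5], [0, -1, 1], [0, 0, 1]]
b = [6, 19, 4]

w = forward_substitution(L, b)  # this is the vector Ux
x = back_substitution(U, w)
print(w, x)
```

Each solve touches every matrix entry at most once, which is why having $L$ and
$U$ in hand is cheaper than row reducing $M$ again for each new right-hand side.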
Another LU Decomposition Example


Here we will perform an LU decomposition on the matrix

\[
M = \begin{pmatrix} 1 & 7 & 2 \\ -3 & -21 & 4 \\ 1 & 6 & 3 \end{pmatrix}
\]

following the procedure outlined in Section 7.7.2. So initially we have
$L_1 = I_3$ and $U_1 = M$, and hence

\[
L_2 = \begin{pmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}
\qquad
U_2 = \begin{pmatrix} 1 & 7 & 2 \\ 0 & 0 & 10 \\ 0 & -1 & 1 \end{pmatrix} .
\]

However we now have a problem: the pivot in the second row of $U_2$ is zero, and
since $0 \cdot c = 0$ for any value of $c$ (we are working over a field) no multiple
of the second row can clear the entry below it. We can quickly remedy this by
swapping the second and third rows of $U_2$ to get $U_2'$, and note that we just
interchange the corresponding rows in all columns left of and including the
column we added values to in $L_2$ to get $L_2'$. Yet this gives us a small problem
as $L_2'U_2' \neq M$; in fact it gives us the similar matrix $M'$ with the second and
third rows swapped. In our original problem $MX = V$, we also need to make the
corresponding swap on our vector $V$ to get a $V'$, since all of this amounts to
changing the order of our two equations, and note that this clearly does not
change the solution. Back to our example, we have

\[
L_2' = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\qquad
U_2' = \begin{pmatrix} 1 & 7 & 2 \\ 0 & -1 & 1 \\ 0 & 0 & 10 \end{pmatrix},
\]

and note that $U_2'$ is upper triangular. Finally you can easily see that

\[
L_2'U_2' = \begin{pmatrix} 1 & 7 & 2 \\ 1 & 6 & 3 \\ -3 & -21 & 4 \end{pmatrix} = M'
\]

which solves the problem of $L_2'U_2'X = M'X = V'$. (We note that as augmented
matrices $(M'|V') \sim (M|V)$.)


Block LDU Explanation 


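This video explains how to do a block LDU decomposition. Firstly remember
some key facts about block matrices: It is important that the blocks fit
together properly. For example, if we have matrices

    matrix | shape
    -------+-------
    X      | r x r
    Y      | r x t
    Z      | t x r
    W      | t x t

we could fit these together as a $(r + t) \times (r + t)$ square block matrix

\[
M = \left(\begin{array}{c|c} X & Y \\ \hline Z & W \end{array}\right) .
\]

Matrix multiplication works for blocks just as for matrix entries:

\[
M^2 = \left(\begin{array}{c|c} X & Y \\ \hline Z & W \end{array}\right)
\left(\begin{array}{c|c} X & Y \\ \hline Z & W \end{array}\right)
= \left(\begin{array}{c|c} X^2 + YZ & XY + YW \\ \hline ZX + WZ & ZY + W^2 \end{array}\right) .
\]

Now let's specialize to the case where the square matrix $X$ has an inverse.
Then we can multiply out the following triple product of a lower triangular,
a block diagonal and an upper triangular matrix:

\[
\left(\begin{array}{c|c} I & 0 \\ \hline ZX^{-1} & I \end{array}\right)
\left(\begin{array}{c|c} X & 0 \\ \hline 0 & W - ZX^{-1}Y \end{array}\right)
\left(\begin{array}{c|c} I & X^{-1}Y \\ \hline 0 & I \end{array}\right)
\]
\[
= \left(\begin{array}{c|c} I & 0 \\ \hline ZX^{-1} & I \end{array}\right)
\left(\begin{array}{c|c} X & Y \\ \hline 0 & W - ZX^{-1}Y \end{array}\right)
\]
\[
= \left(\begin{array}{c|c} X & Y \\ \hline ZX^{-1}X & ZX^{-1}Y + W - ZX^{-1}Y \end{array}\right)
= \left(\begin{array}{c|c} X & Y \\ \hline Z & W \end{array}\right) = M .
\]

This shows that the LDU decomposition given in Section 7.7 is correct.


Permutation Example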


Let's try to get the hang of permutations. A permutation is a function which
scrambles things. Suppose we had

[Figure: a permutation $\sigma$ of the numbers 1, 2, 3, 4.]

Then we could write this as

\[
\sigma = \begin{bmatrix} 1 & 2 & 3 & 4 \\ \sigma(1) & \sigma(2) & \sigma(3) & \sigma(4) \end{bmatrix}
= \begin{bmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 4 & 1 \end{bmatrix}
\]

We could write this permutation in two steps by saying that first we swap 3
and 4, and then we swap 1 and 3. The order here is important.

This is an even permutation, since the number of swaps we used is two (an even
number).

Elementary Matrices 


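This video will explain some of the ideas behind elementary matrices. First
think back to linear systems, for example $n$ equations in $n$ unknowns:

\[
\begin{array}{ccc}
a^1_1 x^1 + a^1_2 x^2 + \cdots + a^1_n x^n &=& v^1 \\
a^2_1 x^1 + a^2_2 x^2 + \cdots + a^2_n x^n &=& v^2 \\
\vdots & & \vdots \\
a^n_1 x^1 + a^n_2 x^2 + \cdots + a^n_n x^n &=& v^n .
\end{array}
\]

We know it is helpful to store the above information with matrices and vectors

\[
M := \begin{pmatrix}
a^1_1 & a^1_2 & \cdots & a^1_n \\
a^2_1 & a^2_2 & \cdots & a^2_n \\
\vdots & \vdots & & \vdots \\
a^n_1 & a^n_2 & \cdots & a^n_n
\end{pmatrix}, \qquad
X := \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix}, \qquad
V := \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{pmatrix} .
\]

Here we will focus on the case that $M$ is square because we are interested in
its inverse $M^{-1}$ (if it exists) and its determinant (whose job it will be to
determine the existence of $M^{-1}$).

We know at least three ways of handling this linear system problem:


1. As an augmented matrix

\[ (M \,|\, V) . \]

Here our plan would be to perform row operations until the system looks
like

\[ (I \,|\, M^{-1}V), \]

(assuming that $M^{-1}$ exists).

2. As a matrix equation

\[ MX = V, \]

which we would solve by finding $M^{-1}$ (again, if it exists), so that

\[ X = M^{-1}V . \]

3. As a linear transformation

\[ L : \mathbb{R}^n \longrightarrow \mathbb{R}^n \]

via

\[ \mathbb{R}^n \ni X \longmapsto MX \in \mathbb{R}^n . \]

In this case we have to study the equation $L(X) = V$ because $V \in \mathbb{R}^n$.


Let's focus on the first two methods. In particular we want to think about
how the augmented matrix method can give information about finding $M^{-1}$, and
how it can be used for handling determinants.

The main idea is that the row operations changed the augmented matrices,
but we also know how to change a matrix $M$ by multiplying it by some other
matrix $E$, so that $M \to EM$. In particular can we find "elementary matrices"
that perform row operations?

Once we find these elementary matrices it is very important to ask how they
affect the determinant, but you can think about that for yourself right
now.

Let's tabulate our names for the matrices that perform the various row
operations:

    Row operation                    | Elementary Matrix
    ---------------------------------+------------------
    $R_i \leftrightarrow R_j$        | $E^i_j$
    $R_i \to \lambda R_i$            | $R^i(\lambda)$
    $R_i \to R_i + \lambda R_j$      | $S^i_j(\lambda)$

To finish off the video, here is how all these elementary matrices work
for a $2 \times 2$ example. Let's take

\[ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} . \]

A good thing to think about is what happens to $\det M = ad - bc$ under the
operations below.


• Row swap:

\[
E^1_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
E^1_2 M = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix}
= \begin{pmatrix} c & d \\ a & b \end{pmatrix} .
\]

• Scalar multiplying:

\[
R^1(\lambda) = \begin{pmatrix} \lambda & 0 \\ 0 & 1 \end{pmatrix}, \qquad
R^1(\lambda) M = \begin{pmatrix} \lambda & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix}
= \begin{pmatrix} \lambda a & \lambda b \\ c & d \end{pmatrix} .
\]

• Row sum:

\[
S^1_2(\lambda) = \begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix}, \qquad
S^1_2(\lambda) M = \begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix}
= \begin{pmatrix} a + \lambda c & b + \lambda d \\ c & d \end{pmatrix} .
\]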

Elementary Determinants 


This video will show you how to calculate determinants of elementary matrices. 
First remember that the job of an elementary row matrix is to perform row 
operations, so that if E is an elementary row matrix and M some given matrix, 


EM 


is the matrix M with a row operation performed on it. 
The next thing to remember is that the determinant of the identity is 1. 
Moreover, we also know what row operations do to determinants: 


• Row swap $E^i_j$: flips the sign of the determinant.


• Scalar multiplication $R^i(\lambda)$: multiplying a row by $\lambda$ multiplies the
determinant by $\lambda$.


• Row addition $S^i_j(\lambda)$: adding some amount of one row to another does not
change the determinant.


The corresponding elementary matrices are obtained by performing exactly
these operations on the identity matrix.

So to calculate their determinants, we just have to apply the above list
of what happens to the determinant of a matrix under row operations to the
determinant of the identity. This yields

\[
\det E^i_j = -1, \qquad \det R^i(\lambda) = \lambda, \qquad \det S^i_j(\lambda) = 1 .
\]


Determinants and Inverses 


Lets figure out the relationship between determinants and invertibility. If 
we have a system of equations Mz = b and we have the inverse M`! then if we 
multiply on both sides we get x = M~'Mx = Mtb. If the inverse exists we 
can solve for x and get a solution that looks like a point. 

So what could go wrong when we want solve a system of equations and get a 
solution that looks like a point? Something would go wrong if we didn’t have 
enough equations for example if we were just given 


eg+y=1 


or maybe, to make this a square matrix M we could write this as 


epy=l 
0=0 
The matrix for this would be M = l l and det(M) = 0. When we compute the 


determinant, this row of all zeros gets multiplied in every term. If instead 
we were given redundant equations 


1 1 
22 


with an elementary row operation, we could replace the second row with a row 


The matrix for this would be M =| | and det(M) = 0. But we know that 


371 


372 


Movie Scripts 


of all zeros. Somehow the determinant is able to detect that there is only one 
equation here. Even if we had a set of contradictory set of equations such as 


t+ty=1 
2x + 2y = 0, 
where it is not possible for both of these equations to be true, the matrix M 
is still the same, and still has a determinant zero. 


Let's look at a three by three example, where the third equation is the sum
of the first two equations:

\[ \begin{aligned} x + y + z &= 1 \\ y + z &= 1 \\ x + 2y + 2z &= 2, \end{aligned} \]

and the matrix for this is

\[ M = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 2 & 2 \end{pmatrix}. \]

If we were trying to find the inverse to this matrix using elementary
matrices, we would get

\[ \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 2 & 2 & 0 & 0 & 1 \end{array}\right) \sim \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & -1 & 1 \end{array}\right) \]

and we would be stuck here. The last row of all zeros cannot be converted
into the bottom row of a $3 \times 3$ identity matrix. This matrix has no inverse,
and the row of all zeros ensures that the determinant will be zero. It can
be difficult to see when one of the rows of a matrix is a linear combination
of the others, and what makes the determinant a useful tool is that with this
reasonably simple computation we can find out if the matrix is invertible, and
if the system will have a solution of a single point or column vector.
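
As a quick check of the claims above, here is a minimal Python sketch (not part of the original script) that computes determinants by row reduction with exact fractions. The function name `det` and the sample invertible matrix are illustrative choices; the three singular matrices are the ones from the examples in this section.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    result = Fraction(1)
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry here.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: the matrix is singular
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            result = -result  # a row swap flips the sign of the determinant
        result *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            # Adding a multiple of one row to another leaves the determinant alone.
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return result

missing_equation = [[1, 1], [0, 0]]
redundant = [[1, 1], [2, 2]]
three_by_three = [[1, 1, 1], [0, 1, 1], [1, 2, 2]]
invertible = [[1, 1], [1, -1]]  # illustrative matrix, not from the text

print(det(missing_equation), det(redundant), det(three_by_three), det(invertible))
```

The first three determinants vanish, while the last one does not, which matches the idea that a zero determinant signals a missing or redundant equation.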


Alternative Proof 


Here we will prove more directly that the determinant of a product of matrices
is the product of their determinants. First we reference that for a matrix
$M$ with rows $r_i$, if $M'$ is the matrix with rows $r'_j = r_j + \lambda r_i$ for $j \neq i$ and
$r'_i = r_i$, then $\det(M) = \det(M')$. Essentially we have $M'$ as $M$ multiplied by the
elementary row sum matrices $S^i_j(\lambda)$. Hence we can create an upper-triangular
matrix $U$ such that $\det(M) = \det(U)$ by first using the first row to set $m^i_1 \to 0$
for all $i > 1$, then iteratively (increasing $k$ by 1 each time) for fixed $k$ using
the $k$-th row to set $m^i_k \to 0$ for all $i > k$.

Now note that for two upper-triangular matrices $U = (u^j_i)$ and $U' = (u'^j_i)$,
by matrix multiplication we have $X = UU' = (x^j_i)$ is upper-triangular and
$x^i_i = u^i_i u'^i_i$. Also, since every permutation other than the identity would contain
a lower diagonal entry (which is 0), we have $\det(U) = \prod_i u^i_i$. Let $A$ and $A'$ have
corresponding upper-triangular matrices $U$ and $U'$ respectively (i.e. $\det(A) = \det(U)$). We note
that $AA'$ has a corresponding upper-triangular matrix $UU'$, and hence we have

\[ \begin{aligned} \det(AA') = \det(UU') &= \prod_i u^i_i u'^i_i \\ &= \Big(\prod_i u^i_i\Big)\Big(\prod_i u'^i_i\Big) \\ &= \det(U)\det(U') = \det(A)\det(A'). \end{aligned} \]


Practice taking Determinants 


Let's practice taking determinants of $2 \times 2$ and $3 \times 3$ matrices.

For $2 \times 2$ matrices we have a formula

\[ \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc. \]

This formula might be easier to remember if you picture the two diagonals of
the matrix: the product along the main diagonal enters with a plus sign and the
product along the other diagonal enters with a minus sign.

Now we can look at three by three matrices and see a few ways to compute
the determinant. We have a similar pattern for $3 \times 3$ matrices. Consider the
example

\[ \det \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} = \big((1\cdot 1\cdot 1) + (2\cdot 2\cdot 0) + (3\cdot 3\cdot 0)\big) - \big((3\cdot 1\cdot 0) + (1\cdot 2\cdot 0) + (3\cdot 2\cdot 1)\big) = -5. \]

We can draw a picture with similar diagonals to find the terms that will be
positive and the terms that will be negative.

Another way to compute the determinant of a matrix is to use this recursive
formula. Here I take each entry of the first row and multiply it by
the determinant of its minor and by the cofactor sign. Then we can use the formula
for a two by two determinant to compute the determinant of the minors:

\[ \det \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} = 1\begin{vmatrix} 1 & 2 \\ 0 & 1 \end{vmatrix} - 2\begin{vmatrix} 3 & 2 \\ 0 & 1 \end{vmatrix} + 3\begin{vmatrix} 3 & 1 \\ 0 & 0 \end{vmatrix} = 1(1-0) - 2(3-0) + 3(0-0) = -5. \]

Decide which way you prefer and get good at taking determinants; you'll need
to compute them in a lot of problems.
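
If you want to check your hand computations, the recursive first-row expansion can be sketched in a few lines of Python. This is only an illustration (the function name `det_cofactor` is made up here), applied to the matrix with rows (1, 2, 3), (3, 1, 2), (0, 0, 1) from the example.

```python
def det_cofactor(matrix):
    """Determinant by expanding along the first row (cofactor expansion)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        # The minor deletes the first row and the j-th column.
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det_cofactor(minor)
    return total

example = [[1, 2, 3], [3, 1, 2], [0, 0, 1]]
print(det_cofactor(example))          # -5
print(det_cofactor([[4, 2], [1, 3]])) # 10, which is ad - bc = 4*3 - 2*1
```

This recursion does a lot of repeated work for large matrices, so it is best kept for small cases and for checking answers.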


Hint for Review Problem 6 


For an arbitrary $3 \times 3$ matrix $A = (a^i_j)$, we have

\[ \det(A) = a^1_1 a^2_2 a^3_3 + a^1_2 a^2_3 a^3_1 + a^1_3 a^2_1 a^3_2 - a^1_1 a^2_3 a^3_2 - a^1_2 a^2_1 a^3_3 - a^1_3 a^2_2 a^3_1 \]

and so the complexity is $5a + 12m$. Now note that in general, the complexity
$c_n$ of the expansion minors formula of an arbitrary $n \times n$ matrix should be

\[ c_n = (n-1)a + n c_{n-1} m \]

since $\det(A) = \sum_{i=1}^{n} (-1)^{i+1} a^1_i \operatorname{cofactor}(a^1_i)$ and $\operatorname{cofactor}(a^1_i)$ is an $(n-1) \times (n-1)$
matrix. This is one way to prove part (c).


Linear systems as spanning sets


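Suppose that we were given a set of linear equations $l^j(x^1, x^2, \ldots, x^n)$ and we
want to find out if $l^j(X) = v^j$ for all $j$ for some vector $V = (v^j)$. We know that
we can express this as the matrix equation

\[ \sum_i l^j_i x^i = v^j \]

where $l^j_i$ is the coefficient of the variable $x^i$ in the equation $l^j$. However, this
is also stating that $V$ is in the span of the vectors $\{L_i\}_i$ where $L_i = (l^j_i)_j$. For
example, consider the set of equations

\[ \begin{aligned} 2x + 3y - z &= 5 \\ -x + 3y + z &= 1 \\ x + y - 2z &= 3 \end{aligned} \]

which corresponds to the matrix equation

\[ \begin{pmatrix} 2 & 3 & -1 \\ -1 & 3 & 1 \\ 1 & 1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 5 \\ 1 \\ 3 \end{pmatrix}. \]

We can thus express this problem as determining if the vector

\[ V = \begin{pmatrix} 5 \\ 1 \\ 3 \end{pmatrix} \]

lies in the span of

\[ \left\{ \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix}, \begin{pmatrix} 3 \\ 3 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \\ -2 \end{pmatrix} \right\}. \]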
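
To make the "span" reading concrete, the following sketch (an illustration only, with a made-up helper `solve`) finds the coefficients that write $V = (5, 1, 3)$ as a combination of the three column vectors $(2,-1,1)$, $(3,3,1)$ and $(-1,1,-2)$. It assumes the columns are linearly independent, so that Gauss-Jordan elimination always finds a pivot.

```python
from fractions import Fraction

def solve(columns, target):
    """Find x with sum_i x[i] * columns[i] == target by Gauss-Jordan elimination.

    Assumes a square system whose columns are linearly independent.
    """
    n = len(target)
    # Row i of the augmented matrix holds the i-th entry of every column vector.
    aug = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(target[i])]
           for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

columns = [(2, -1, 1), (3, 3, 1), (-1, 1, -2)]
V = (5, 1, 3)
x, y, z = solve(columns, V)
print(x, y, z)  # 12/13 11/13 -8/13

# Rebuild V from the columns to confirm that it lies in their span.
rebuilt = tuple(x * a + y * b + z * c for a, b, c in zip(*columns))
print(rebuilt == V)  # True
```

The coefficients are exactly the solution $(x, y, z)$ of the original system, which is the point of the reformulation.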


Hint for Review Problem 2 


For the first part, try drawing an example in $\mathbb{R}^3$.
Here we have taken the subspace $W$ to be a plane through the origin and $U$ to
be a line through the origin. The hint now is to think about what happens when
you add a vector $u \in U$ to a vector $w \in W$. Does this live in the union $U \cup W$?

For the second part, we take a more theoretical approach. Let's suppose
that $v \in U \cap W$ and $v' \in U \cap W$. This implies

\[ v \in U \quad\text{and}\quad v' \in U. \]

So, since $U$ is a subspace and all subspaces are vector spaces, we know that
the linear combination

\[ \alpha v + \beta v' \in U. \]

Now repeat the same logic for $W$ and you will be nearly done.


G.9 Linear Independence 


Worked Example 


This video gives some more details behind the example for the following four
vectors in $\mathbb{R}^3$:

\[ v_1 = \begin{pmatrix} 4 \\ -1 \\ 3 \end{pmatrix}, \quad v_2 = \begin{pmatrix} -3 \\ 7 \\ 4 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 5 \\ 12 \\ 17 \end{pmatrix}, \quad v_4 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}. \]

The example asks whether they are linearly independent, and the answer is
immediate: NO, four vectors can never be linearly independent in $\mathbb{R}^3$. This
vector space is simply not big enough for that, but you need to understand the
notion of the dimension of a vector space to see why. So we think the vectors
$v_1$, $v_2$, $v_3$ and $v_4$ are linearly dependent, which means we need to show that there
is a solution to

\[ \alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 v_3 + \alpha_4 v_4 = 0 \]

for the numbers $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\alpha_4$ not all vanishing.
To find this solution we need to set up a linear system. Writing out the
above linear combination gives

\[ \begin{aligned} 4\alpha_1 - 3\alpha_2 + 5\alpha_3 - \alpha_4 &= 0, \\ -\alpha_1 + 7\alpha_2 + 12\alpha_3 + \alpha_4 &= 0, \\ 3\alpha_1 + 4\alpha_2 + 17\alpha_3 &= 0. \end{aligned} \]

This can be easily handled using an augmented matrix whose columns are just
the vectors we started with

\[ \left(\begin{array}{cccc|c} 4 & -3 & 5 & -1 & 0 \\ -1 & 7 & 12 & 1 & 0 \\ 3 & 4 & 17 & 0 & 0 \end{array}\right). \]

Since there are only zeros in the right hand column, we can drop it. Now we
perform row operations to achieve RREF

\[ \begin{pmatrix} 4 & -3 & 5 & -1 \\ -1 & 7 & 12 & 1 \\ 3 & 4 & 17 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & \frac{71}{25} & -\frac{4}{25} \\ 0 & 1 & \frac{53}{25} & \frac{3}{25} \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]

This says that $\alpha_3$ and $\alpha_4$ are not pivot variables so are arbitrary; we set them
to $\mu$ and $\nu$, respectively. Thus

\[ \alpha_1 = -\frac{71}{25}\mu + \frac{4}{25}\nu, \quad \alpha_2 = -\frac{53}{25}\mu - \frac{3}{25}\nu, \quad \alpha_3 = \mu, \quad \alpha_4 = \nu. \]

Thus we have found a relationship among our four vectors

\[ \Big(-\frac{71}{25}\mu + \frac{4}{25}\nu\Big) v_1 + \Big(-\frac{53}{25}\mu - \frac{3}{25}\nu\Big) v_2 + \mu v_3 + \nu v_4 = 0. \]

In fact this is not just one relation, but infinitely many, for any choice of
$\mu, \nu$. The relationship quoted in the notes is just one of those choices.

Finally, since the vectors $v_1$, $v_2$, $v_3$ and $v_4$ are linearly dependent, we
can try to eliminate some of them. The pattern here is to keep the vectors
that correspond to columns with pivots. For example, setting $\mu = -1$ (say) and
$\nu = 0$ in the above allows us to solve for $v_3$ while $\mu = 0$ and $\nu = -1$ (say) gives
$v_4$; explicitly we get

\[ v_3 = \frac{71}{25} v_1 + \frac{53}{25} v_2, \qquad v_4 = -\frac{4}{25} v_1 + \frac{3}{25} v_2. \]

This eliminates $v_3$ and $v_4$ and leaves a pair of linearly independent vectors $v_1$
and $v_2$.
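
The family of relations can be checked numerically. The sketch below is only an illustration (the helper names `combo` and `relation` are invented here); it uses the vectors $v_1 = (4,-1,3)$, $v_2 = (-3,7,4)$, $v_3 = (5,12,17)$, $v_4 = (-1,1,0)$ and the coefficients $\alpha_1 = -\frac{71}{25}\mu + \frac{4}{25}\nu$, $\alpha_2 = -\frac{53}{25}\mu - \frac{3}{25}\nu$, $\alpha_3 = \mu$, $\alpha_4 = \nu$ read off from the reduced matrix.

```python
from fractions import Fraction as F

v1, v2, v3, v4 = (4, -1, 3), (-3, 7, 4), (5, 12, 17), (-1, 1, 0)

def combo(coeffs, vectors):
    """Return the linear combination sum_i coeffs[i] * vectors[i]."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(3))

def relation(mu, nu):
    """Coefficients (alpha_1, ..., alpha_4) for free parameters mu and nu."""
    a1 = -F(71, 25) * mu + F(4, 25) * nu
    a2 = -F(53, 25) * mu - F(3, 25) * nu
    return (a1, a2, mu, nu)

# Every choice of (mu, nu) should give the zero vector.
for mu, nu in [(1, 0), (0, 1), (-1, 0), (3, -7)]:
    print(mu, nu, combo(relation(mu, nu), (v1, v2, v3, v4)) == (0, 0, 0))

# The two special choices express v3 and v4 through v1 and v2.
print(combo((F(71, 25), F(53, 25)), (v1, v2)) == v3)
print(combo((F(-4, 25), F(3, 25)), (v1, v2)) == v4)
```

Exact fractions are used so that the comparison with zero is not affected by rounding.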
Worked Proof


Here we will work through a quick version of the proof of Theorem 10.1.1. Let
$\{v_i\}$ denote a set of linearly dependent vectors, so $\sum_i c^i v_i = 0$ where there
exists some $c^k \neq 0$. Now without loss of generality we order our vectors such
that $c^1 \neq 0$, and we can do so since addition is commutative (i.e. $a + b = b + a$).
Therefore we have

\[ \begin{aligned} c^1 v_1 &= -\sum_{i=2}^{n} c^i v_i \\ v_1 &= -\sum_{i=2}^{n} \frac{c^i}{c^1} v_i \end{aligned} \]

and we note that this argument is completely reversible since every $c^i \neq 0$ is
invertible and $0/c^i = 0$.


Hint for Review Problem 1 


Let's first remember how $\mathbb{Z}_2$ works. The only two elements are 1 and 0, which
means when you add $1 + 1$ you get 0. It also means when you have a vector $v \in B^n$
and you want to multiply it by a scalar, your only choices are 1 and 0. This
is kind of neat because it means that the possibilities are finite, so we can
look at an entire vector space.

Now let's think about $B^3$. There is a choice you have to make for each
coordinate: you can either put a 1 or a 0, and there are three places where you
have to make a decision between two things. This means that you have $2^3 = 8$
possibilities for vectors in $B^3$.

When you want to think about finding a set $S$ that will span $B^3$ and is
linearly independent, you want to think about how many vectors you need. You
will need to have enough so that you can make every vector in $B^3$ using linear
combinations of elements in $S$, but you don't want so many that some of
them are linear combinations of each other. I suggest trying something really
simple, perhaps something that looks like the columns of the identity matrix.

For part (c) you have to show that you can write every one of the elements
as a linear combination of the elements in $S$; this will check to make sure $S$
actually spans $B^3$.

For part (d) if you have two vectors that you think will span the space,
you can prove that they do by repeating what you did in part (c): check that
every vector can be written using only copies of these two vectors. If you
don't think it will work you should show why, perhaps using an argument that
counts the number of possible vectors in the span of two vectors.
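
Because everything here is finite, the counting can be done by brute force. The following Python sketch is an illustration only (the helper `span_mod2` and the candidate set `S` are choices made here, following the "columns of the identity matrix" suggestion): it lists all of $B^3$ and computes spans with arithmetic mod 2.

```python
from itertools import product

def span_mod2(vectors):
    """All linear combinations of the given bit vectors, with arithmetic mod 2."""
    n = len(vectors[0])
    result = set()
    # Over Z_2 the only scalars are 0 and 1, so try every 0/1 coefficient choice.
    for coeffs in product((0, 1), repeat=len(vectors)):
        combo = tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) % 2
                      for k in range(n))
        result.add(combo)
    return result

B3 = set(product((0, 1), repeat=3))
S = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]  # columns of the identity matrix

print(len(B3))                # 8
print(span_mod2(S) == B3)     # True
print(len(span_mod2(S[:2])))  # 4
```

Two vectors admit only $2^2 = 4$ coefficient choices, so their span can never contain more than 4 vectors, whichever two you pick.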


G.10 Basis and Dimension


Proof Explanation 


Let's walk through the proof of Theorem 11.0.1. We want to show that for
$S = \{v_1, \ldots, v_n\}$ a basis for a vector space $V$, every vector $w \in V$ can be
written uniquely as a linear combination of vectors in the basis $S$:

\[ w = c^1 v_1 + \cdots + c^n v_n. \]

We should remember that since $S$ is a basis for $V$, we know two things:

• $V = \operatorname{span} S$
• $v_1, \ldots, v_n$ are linearly independent, which means that whenever we have
  $a^1 v_1 + \cdots + a^n v_n = 0$ this implies that $a^i = 0$ for all $i = 1, \ldots, n$.

This first fact makes it easy to say that there exist constants $c^i$ such that
$w = c^1 v_1 + \cdots + c^n v_n$. What we don't yet know is that these $c^1, \ldots, c^n$ are unique.

In order to show that these are unique, we will suppose that they are not,
and show that this causes a contradiction. So suppose there exists a second
set of constants $d^i$ such that

\[ w = d^1 v_1 + \cdots + d^n v_n. \]

For this to be a contradiction we need to have $c^i \neq d^i$ for some $i$. Then look
what happens when we take the difference of these two versions of $w$:

\[ \begin{aligned} 0_V &= w - w \\ &= (c^1 v_1 + \cdots + c^n v_n) - (d^1 v_1 + \cdots + d^n v_n) \\ &= (c^1 - d^1) v_1 + \cdots + (c^n - d^n) v_n. \end{aligned} \]

Since the $v_i$'s are linearly independent this implies that $c^i - d^i = 0$ for all $i$.
This means that we cannot have $c^i \neq d^i$, which is a contradiction.


Worked Example 


In this video we will work through an example of how to extend a set of linearly
independent vectors to a basis. For fun, we will take the vector space

\[ V = \{(x, y, z, w) \mid x, y, z, w \in \mathbb{Z}_5\}. \]

This is like four dimensional space $\mathbb{R}^4$ except that the numbers can only be
$\{0, 1, 2, 3, 4\}$. This is like bits, but now the rule is

\[ 0 = 5. \]

Thus, for example, $\frac{1}{4} = 4$ because $4 \cdot 4 = 16 = 1 + 3 \times 5 = 1$. Don't get too caught up
on this aspect; it's a choice of base field designed to make computations go
quicker!

Now, here's the problem we will solve:

Find a basis for $V$ that includes the vectors $\begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 3 \\ 2 \\ 1 \end{pmatrix}$.

The way to proceed is to add a known (and preferably simple) basis to the
vectors given, thus we consider

\[ v_1 = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix},\ v_2 = \begin{pmatrix} 0 \\ 3 \\ 2 \\ 1 \end{pmatrix},\ e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix},\ e_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix},\ e_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix},\ e_4 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \]

The last four vectors are clearly a basis (make sure you understand this....)
and are called the canonical basis. We want to keep $v_1$ and $v_2$ but find a way to
turf out two of the vectors in the canonical basis leaving us a basis of four
vectors. To do that, we have to study linear independence, or in other words
a linear system problem defined by

\[ 0 = \alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 e_1 + \alpha_4 e_2 + \alpha_5 e_3 + \alpha_6 e_4. \]

We want to find solutions for the $\alpha$'s which allow us to determine two of the
$e$'s. For that we use an augmented matrix

\[ \left(\begin{array}{cccccc|c} 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 2 & 3 & 0 & 1 & 0 & 0 & 0 \\ 3 & 2 & 0 & 0 & 1 & 0 & 0 \\ 4 & 1 & 0 & 0 & 0 & 1 & 0 \end{array}\right). \]


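Next comes a bunch of row operations. Note that we have dropped the last column
of zeros since it has no information--you can fill in the row operations used
above the $\sim$'s as an exercise:

\[ \begin{pmatrix} 1&0&1&0&0&0 \\ 2&3&0&1&0&0 \\ 3&2&0&0&1&0 \\ 4&1&0&0&0&1 \end{pmatrix} \sim \begin{pmatrix} 1&0&1&0&0&0 \\ 0&3&3&1&0&0 \\ 0&2&2&0&1&0 \\ 0&1&1&0&0&1 \end{pmatrix} \sim \begin{pmatrix} 1&0&1&0&0&0 \\ 0&1&1&2&0&0 \\ 0&2&2&0&1&0 \\ 0&1&1&0&0&1 \end{pmatrix} \sim \begin{pmatrix} 1&0&1&0&0&0 \\ 0&1&1&2&0&0 \\ 0&0&0&1&1&0 \\ 0&0&0&3&0&1 \end{pmatrix} \]

\[ \sim \begin{pmatrix} 1&0&1&0&0&0 \\ 0&1&1&2&0&0 \\ 0&0&0&1&1&0 \\ 0&0&0&0&2&1 \end{pmatrix} \sim \begin{pmatrix} \underline{1}&0&1&0&0&0 \\ 0&\underline{1}&1&0&0&1 \\ 0&0&0&\underline{1}&0&2 \\ 0&0&0&0&\underline{1}&3 \end{pmatrix}. \]

The pivots are underlined. The columns corresponding to non-pivot variables
are the ones that can be eliminated--their coefficients (the $\alpha$'s) will be
arbitrary, so set them all to zero save for the one next to the vector you are
solving for which can be taken to be unity. Thus that vector can certainly be
expressed in terms of previous ones. Hence, altogether, our basis is

\[ \left\{ \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}, \begin{pmatrix} 0 \\ 3 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} \right\}. \]

Finally, as a check, note that $e_1 = v_1 + v_2$ which explains why we had to throw
it away.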
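
Row reduction over $\mathbb{Z}_5$ is easy to get wrong by hand, so here is a small Python sketch for checking it. It is only an illustration: the function `rref_mod_p` is written for this note, it assumes $p$ is prime, and it relies on `pow(x, -1, p)` for modular inverses, which needs Python 3.8 or later. The columns of the matrix are $v_1 = (1,2,3,4)$, $v_2 = (0,3,2,1)$ followed by the canonical basis.

```python
def rref_mod_p(matrix, p):
    """Reduced row echelon form over Z_p (p prime); returns (rows, pivot columns)."""
    rows = [[x % p for x in row] for row in matrix]
    pivots = []
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a non-pivot column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        inverse = pow(rows[r][c], -1, p)  # modular inverse, Python 3.8+
        rows[r] = [(x * inverse) % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

# Columns: v1, v2, e1, e2, e3, e4.
A = [[1, 0, 1, 0, 0, 0],
     [2, 3, 0, 1, 0, 0],
     [3, 2, 0, 0, 1, 0],
     [4, 1, 0, 0, 0, 1]]

reduced, pivots = rref_mod_p(A, 5)
for row in reduced:
    print(row)
print(pivots)  # [0, 1, 3, 4]
```

The pivot columns 0, 1, 3, 4 correspond to $v_1$, $v_2$, $e_2$, $e_3$, and the non-pivot column for $e_1$ reads $(1, 1, 0, 0)$, which is the statement $e_1 = v_1 + v_2$.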


Hint for Review Problem 2 


Since there are two possible values for each entry, we have $|B^n| = 2^n$. We note
that $\dim B^n = n$ as well. Explicitly we have $B^1 = \{(0), (1)\}$ so there is only 1
basis for $B^1$. Similarly we have

\[ B^2 = \left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\} \]

and so choosing any two distinct non-zero vectors will form a basis. Now in general we
note that we can build up a basis $\{e_i\}$ by arbitrarily (independently) choosing
the first $i-1$ entries, then setting the $i$-th entry to 1 and all higher entries
to 0.


G.11 Eigenvalues and Eigenvectors 


2 x 2 Example 


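Here is an example of how to find the eigenvalues and eigenvectors of a $2 \times 2$
matrix

\[ M = \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}. \]

Remember that an eigenvector $v$ with eigenvalue $\lambda$ for $M$ will be a vector such
that $Mv = \lambda v$, i.e. $M(v) - \lambda I(v) = 0$. When we are talking about a nonzero $v$
then this means that $\det(M - \lambda I) = 0$. We will start by finding the eigenvalues
that make this statement true. First we compute

\[ \det(M - \lambda I) = \det\left( \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix} - \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} \right) = \det \begin{pmatrix} 4-\lambda & 2 \\ 1 & 3-\lambda \end{pmatrix} \]

so $\det(M - \lambda I) = (4-\lambda)(3-\lambda) - 2 \cdot 1$. We set this equal to zero to find values
of $\lambda$ that make this true:

\[ (4-\lambda)(3-\lambda) - 2\cdot 1 = 10 - 7\lambda + \lambda^2 = (2-\lambda)(5-\lambda) = 0. \]

This means that $\lambda = 2$ and $\lambda = 5$ are solutions. Now if we want to find the
eigenvectors that correspond to these values we look at vectors $v$ such that

\[ \begin{pmatrix} 4-\lambda & 2 \\ 1 & 3-\lambda \end{pmatrix} v = 0. \]

For $\lambda = 5$

\[ \begin{pmatrix} 4-5 & 2 \\ 1 & 3-5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -1 & 2 \\ 1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 0. \]

This gives us the equalities $-x + 2y = 0$ and $x - 2y = 0$ which both give the line
$y = \frac{1}{2}x$. Any point on this line, so for example $\begin{pmatrix} 2 \\ 1 \end{pmatrix}$, is an eigenvector with
eigenvalue $\lambda = 5$.

Now let's find the eigenvector for $\lambda = 2$

\[ \begin{pmatrix} 4-2 & 2 \\ 1 & 3-2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 & 2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 0, \]

which gives the equalities $2x + 2y = 0$ and $x + y = 0$. (Notice that these equations
are not independent of one another, so our eigenvalue must be correct.)
This means any vector $v = \begin{pmatrix} x \\ y \end{pmatrix}$ where $y = -x$, such as $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$, or any scalar
multiple of this vector, i.e. any vector on the line $y = -x$, is an eigenvector
with eigenvalue 2. This solution could be written neatly as

\[ \lambda_1 = 5,\ v_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \quad\text{and}\quad \lambda_2 = 2,\ v_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]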
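
A direct way to test an eigenpair is to multiply and compare. The short Python sketch below is an illustration only (the helper `matvec` is defined here); it checks that $M = \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}$ sends $(2, 1)$ to 5 times itself and $(1, -1)$ to 2 times itself.

```python
def matvec(matrix, vector):
    """Multiply a matrix (tuple of rows) by a column vector."""
    return tuple(sum(a * b for a, b in zip(row, vector)) for row in matrix)

M = ((4, 2), (1, 3))
eigenpairs = [(5, (2, 1)), (2, (1, -1))]

for eigenvalue, eigenvector in eigenpairs:
    image = matvec(M, eigenvector)
    scaled = tuple(eigenvalue * x for x in eigenvector)
    print(eigenvalue, image, image == scaled)

# A vector off both eigen-lines is not simply rescaled by M.
print(matvec(M, (1, 0)))  # (4, 1), not a multiple of (1, 0)
```

Any scalar multiple of an eigenvector passes the same test, which is why the answer is a whole line rather than a single vector.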


Jordan Block Example 


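Consider the matrix

\[ J_2 = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}, \]

and we note that we can just read off the eigenvector $e_1$ with eigenvalue $\lambda$.
However the characteristic polynomial of $J_2$ is $P_{J_2}(\mu) = (\mu - \lambda)^2$ so the only
possible eigenvalue is $\lambda$, but we claim it does not have a second eigenvector
$v$. To see this, we require that

\[ \begin{aligned} \lambda v^1 + v^2 &= \lambda v^1 \\ \lambda v^2 &= \lambda v^2 \end{aligned} \]

which clearly implies that $v^2 = 0$. This is known as a Jordan 2-cell, and in
general, a Jordan $n$-cell with eigenvalue $\lambda$ is (similar to) the $n \times n$ matrix

\[ J_n = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & \lambda & 1 \\ 0 & \cdots & 0 & 0 & \lambda \end{pmatrix} \]

which has a single eigenvector $e_1$.

Now consider the following matrix

\[ M = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 2 \end{pmatrix} \]

and we see that $P_M(\lambda) = (\lambda - 3)^2(\lambda - 2)$. Therefore for $\lambda = 3$ we need to find the
solutions to $(M - 3I_3)v = 0$ or in equation form:

\[ \begin{aligned} v^2 &= 0 \\ v^3 &= 0 \\ -v^3 &= 0, \end{aligned} \]

and we immediately see that we must have $v = e_1$. Next for $\lambda = 2$, we need to
solve $(M - 2I_3)v = 0$ or

\[ \begin{aligned} v^1 + v^2 &= 0 \\ v^2 + v^3 &= 0 \\ 0 &= 0, \end{aligned} \]

and thus we choose $v^1 = 1$, which implies $v^2 = -1$ and $v^3 = 1$. Hence this is the
only other eigenvector for $M$.
This is a specific case of Problem 13.7.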


Eigenvalues 


Eigenvalues and eigenvectors are extremely important. In this video we review
the theory of eigenvalues. Consider a linear transformation

\[ L : V \longrightarrow V \]

where $\dim V = n < \infty$. Since $V$ is finite dimensional, we can represent $L$ by a
square matrix $M$ by choosing a basis for $V$.

So the eigenvalue equation

\[ Lv = \lambda v \]

becomes

\[ Mv = \lambda v, \]

where $v$ is a column vector and $M$ is an $n \times n$ matrix (both expressed in whatever
basis we chose for $V$). The scalar $\lambda$ is called an eigenvalue of $M$ and the job
of this video is to show you how to find all the eigenvalues of $M$.

The first step is to put all terms on the left hand side of the equation,
this gives

\[ (M - \lambda I)v = 0. \]

Notice how we used the identity matrix $I$ in order to get a matrix times $v$
equaling zero. Now here comes a VERY important fact

\[ Nu = 0 \text{ and } u \neq 0 \iff \det N = 0. \]

I.e., a square matrix can have an eigenvector with vanishing eigenvalue if and only if its
determinant vanishes! Hence

\[ \det(M - \lambda I) = 0. \]

The quantity on the left (up to a possible minus sign) equals the so-called
characteristic polynomial

\[ P_M(\lambda) := \det(\lambda I - M). \]

It is a polynomial of degree $n$ in the variable $\lambda$. To see why, try a simple
$2 \times 2$ example

\[ \det\left( \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} - \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right) = \det \begin{pmatrix} \lambda - a & -b \\ -c & \lambda - d \end{pmatrix} = (\lambda - a)(\lambda - d) - bc, \]

which is clearly a polynomial of order 2 in $\lambda$. For the $n \times n$ case, the order $n$
term comes from the product of diagonal matrix elements also.

There is an amazing fact about polynomials called the fundamental theorem
of algebra: they can always be factored over complex numbers. This means that
degree $n$ polynomials have $n$ complex roots (counted with multiplicity). The
word can does not mean that explicit formulas for this are known (in fact
explicit formulas can only be given for degree four or less). The necessity
for complex numbers is easily seen from a polynomial like

\[ z^2 + 1 \]

whose roots would require us to solve $z^2 = -1$ which is impossible for real
number $z$. However, introducing the imaginary unit $i$ with

\[ i^2 = -1, \]

we have

\[ z^2 + 1 = (z - i)(z + i). \]

Returning to our characteristic polynomial, we call on the fundamental theorem
of algebra to write

\[ P_M(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n). \]

The roots $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $M$ (or its underlying linear
transformation $L$).
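
For a $2 \times 2$ matrix the characteristic polynomial is $\lambda^2 - (a + d)\lambda + (ad - bc)$, so its roots come from the quadratic formula. The sketch below is an illustration only (the function name `eigenvalues_2x2` and the two test matrices are choices made here); it uses Python's `cmath` so that complex roots such as $\pm i$ come out without any special handling.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of lambda^2 - (a + d) lambda + (ad - bc) for the matrix [[a, b], [c, d]]."""
    trace = a + d
    determinant = a * d - b * c
    # cmath.sqrt accepts negative arguments and returns a complex number.
    root = cmath.sqrt(trace * trace - 4 * determinant)
    return (trace + root) / 2, (trace - root) / 2

# A matrix with two real eigenvalues.
print(eigenvalues_2x2(4, 2, 1, 3))

# A rotation by 90 degrees: its characteristic polynomial is lambda^2 + 1.
print(eigenvalues_2x2(0, -1, 1, 0))
```

The second matrix has no real eigenvalues at all, which is the matrix version of the statement that $z^2 + 1$ only factors over the complex numbers.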


Eigenspaces 


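Consider the linear map

\[ L = \begin{pmatrix} -4 & 6 & 6 \\ 0 & 2 & 0 \\ -3 & 3 & 5 \end{pmatrix}. \]

Direct computation will show that we have

\[ L = Q \begin{pmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} Q^{-1} \]

where

\[ Q = \begin{pmatrix} 2 & 1 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}. \]

Therefore the vectors

\[ v_1^{(2)} = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \qquad v_2^{(2)} = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \]

span the eigenspace $E^{(2)}$ of the eigenvalue 2, and for an explicit example, if
we take

\[ v = 2v_1^{(2)} - v_2^{(2)} = \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix} \]

we have

\[ Lv = \begin{pmatrix} 2 \\ -2 \\ 4 \end{pmatrix} = 2v \]

so $v \in E^{(2)}$. In general, we note the linearly independent vectors $v_i^{(\lambda)}$ with the
same eigenvalue $\lambda$ span an eigenspace since for any $v = \sum_i c^i v_i^{(\lambda)}$, we have

\[ Lv = \sum_i c^i L v_i^{(\lambda)} = \sum_i c^i \lambda v_i^{(\lambda)} = \lambda \sum_i c^i v_i^{(\lambda)} = \lambda v. \]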
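
The "direct computation" can be sketched in Python. Note one assumption: the matrix of $L$ is not legible in this copy of the script, so the matrix below is reconstructed from $Q$ and the diagonal matrix with entries $-1, 2, 2$ (it is the unique matrix having the columns of $Q$ as eigenvectors with those eigenvalues). The helper `matvec` is defined just for this check.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

# Assumed reconstruction of L = Q diag(-1, 2, 2) Q^{-1}.
L = [[-4, 6, 6],
     [0, 2, 0],
     [-3, 3, 5]]

# Columns of Q paired with their eigenvalues.
eigenpairs = [(-1, [2, 0, 1]), (2, [1, 0, 1]), (2, [1, 1, 0])]
for eigenvalue, vector in eigenpairs:
    print(eigenvalue, matvec(L, vector) == [eigenvalue * x for x in vector])

# A combination of the two eigenvalue-2 vectors stays in the eigenspace.
v = [2 * a - b for a, b in zip([1, 0, 1], [1, 1, 0])]
print(v, matvec(L, v))  # [1, -1, 2] [2, -2, 4]
```

The last line shows $Lv = 2v$ for $v = 2v_1^{(2)} - v_2^{(2)}$, as the general argument predicts.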


Hint for Review Problem 9 


We are looking at the matrix $M$, and a sequence of vectors starting with
$v(0) = \begin{pmatrix} x(0) \\ y(0) \end{pmatrix}$ and defined recursively so that

\[ v(1) = \begin{pmatrix} x(1) \\ y(1) \end{pmatrix} = M \begin{pmatrix} x(0) \\ y(0) \end{pmatrix}. \]

We first examine the eigenvectors and eigenvalues of

\[ M = \begin{pmatrix} 3 & 2 \\ 2 & 3 \end{pmatrix}. \]

We can find the eigenvalues and vectors by solving

\[ \det(M - \lambda I) = 0 \]

for $\lambda$:

\[ \det \begin{pmatrix} 3-\lambda & 2 \\ 2 & 3-\lambda \end{pmatrix} = 0. \]

By computing the determinant and solving for $\lambda$ we can find the eigenvalues $\lambda = 1$
and $5$, and the corresponding eigenvectors. You should do the computations
to find these for yourself.

When we think about the question in part (b) which asks to find a vector
$v(0)$ such that $v(0) = v(1) = v(2) = \ldots$, we must look for a vector that satisfies
$v = Mv$. What eigenvalue does this correspond to? If you found a $v(0)$ with
this property would $cv(0)$ for a scalar $c$ also work? Remember that eigenvectors
have to be nonzero, so what if $c = 0$?

For part (c) if we tried an eigenvector would we have restrictions on what
the eigenvalue should be? Think about what it means to be pointed in the same
direction.


Non Diagonalizable Example


First recall that the derivative operator is linear and that we can write it
as the matrix

\[ \frac{d}{dx} = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 2 & 0 & \cdots \\ 0 & 0 & 0 & 3 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}. \]

We note that this transforms into an infinite Jordan cell with eigenvalue 0
or

\[ \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \]

which is in the basis $\{\frac{1}{n!}x^n\}_n$ (where for $n = 0$, we just have 1). Therefore
we note that 1 (constant polynomials) is the only eigenvector with eigenvalue
0 for polynomials since they have finite degree, and so the derivative is
not diagonalizable. Note that we are ignoring infinite cases for simplicity,
but if you want to consider infinite terms such as convergent series or all
formal power series where there are no conditions on convergence, there are
many eigenvectors. Can you find some? This is an example of how things can
change in infinite dimensional spaces.

For a more finite example, consider the space $P_3^{\mathbb{C}}$ of complex polynomials of
degree at most 3, and recall that the derivative $D$ can be written as

\[ D = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]

You can easily check that the only eigenvector is 1 with eigenvalue 0 since $D$
always lowers the degree of a polynomial by 1 each time it is applied. Note
that this is a nilpotent matrix since $D^4 = 0$, but the only nilpotent matrix
that is "diagonalizable" is the 0 matrix.
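
The nilpotency claim is easy to test by machine. In this sketch (an illustration only; `matmul`, `matvec` and the sample polynomial are choices made here) a polynomial $a_0 + a_1 x + a_2 x^2 + a_3 x^3$ is stored as the coefficient list $[a_0, a_1, a_2, a_3]$, and $D$ is the matrix of the derivative in the basis $\{1, x, x^2, x^3\}$.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(matrix, vector):
    """Multiply a matrix by a column vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

# Derivative on polynomials of degree at most 3, in the basis 1, x, x^2, x^3.
D = [[0, 1, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3],
     [0, 0, 0, 0]]

p = [5, 1, 4, 2]     # 5 + x + 4x^2 + 2x^3
print(matvec(D, p))  # [1, 8, 6, 0], i.e. 1 + 8x + 6x^2

D2 = matmul(D, D)
D3 = matmul(D2, D)
D4 = matmul(D3, D)
print(D3)  # a single nonzero entry: the third derivative of x^3 is 6
print(D4)  # the zero matrix
```

Since $D^4 = 0$, any eigenvalue $\lambda$ of $D$ satisfies $\lambda^4 = 0$, so 0 is the only possible eigenvalue.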


Change of Basis Example 


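This video returns to the example of a barrel filled with fruit
as a demonstration of changing basis.

Since this was a linear systems problem, we can try to represent what's in
the barrel using a vector space. The first representation was the one where
$(x, y) = (\text{apples}, \text{oranges})$:

[Figure: the point $(x, y)$ in the apples-oranges plane.]

Calling the basis vectors $\vec{e}_1 := (1, 0)$ and $\vec{e}_2 := (0, 1)$, this representation would
label what's in the barrel by a vector

\[ \vec{x} := x\vec{e}_1 + y\vec{e}_2 = \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]

Since this is the method ordinary people would use, we will call this the
"engineer's" method!

But this is not the approach nutritionists would use. They would note the
amount of sugar and total number of fruit $(s, f)$:

[Figure: the same point in the sugar-fruit plane.]

WARNING: To make sense of what comes next you need to allow for the possibility
of a negative amount of fruit or sugar. This would be just like a bank, where
if money is owed to somebody else, we can use a minus sign.

The vector $\vec{x}$ says what is in the barrel and does not depend on which
mathematical description is employed. The way nutritionists label $\vec{x}$ is in terms of
a pair of basis vectors $\vec{f}_1$ and $\vec{f}_2$:

\[ \vec{x} = s\vec{f}_1 + f\vec{f}_2 = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 \end{pmatrix} \begin{pmatrix} s \\ f \end{pmatrix}. \]

Thus our vector space now has a bunch of interesting vectors.

The vector $\vec{x}$ labels generally the contents of the barrel. The vector $\vec{e}_1$
corresponds to one apple and no oranges. The vector $\vec{e}_2$ is one orange and no apples.
The vector $\vec{f}_1$ means one unit of sugar and zero total fruit (to achieve this
you could lend out some apples and keep a few oranges). Finally the vector $\vec{f}_2$
represents a total of one piece of fruit and no sugar.

You might remember that the amount of sugar in an apple is called $\lambda$ while
oranges have twice as much sugar as apples. Thus

\[ \begin{aligned} s &= \lambda(x + 2y) \\ f &= x + y. \end{aligned} \]

Essentially, this is already our change of basis formula, but let's play around
and put it in our notations. First we can write this as a matrix

\[ \begin{pmatrix} s \\ f \end{pmatrix} = \begin{pmatrix} \lambda & 2\lambda \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]

We can easily invert this to get

\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -\frac{1}{\lambda} & 2 \\ \frac{1}{\lambda} & -1 \end{pmatrix} \begin{pmatrix} s \\ f \end{pmatrix}. \]

Putting this in the engineer's formula for $\vec{x}$ gives

\[ \vec{x} = \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} \begin{pmatrix} -\frac{1}{\lambda} & 2 \\ \frac{1}{\lambda} & -1 \end{pmatrix} \begin{pmatrix} s \\ f \end{pmatrix} = \begin{pmatrix} -\frac{1}{\lambda}(\vec{e}_1 - \vec{e}_2) & 2\vec{e}_1 - \vec{e}_2 \end{pmatrix} \begin{pmatrix} s \\ f \end{pmatrix}. \]

Comparing to the nutritionist's formula for the same object $\vec{x}$ we learn that

\[ \vec{f}_1 = -\frac{1}{\lambda}(\vec{e}_1 - \vec{e}_2) \quad\text{and}\quad \vec{f}_2 = 2\vec{e}_1 - \vec{e}_2. \]

Rearranging these equations we find the change of base matrix $P$ from the
engineer's basis to the nutritionist's basis:

\[ \begin{pmatrix} \vec{f}_1 & \vec{f}_2 \end{pmatrix} = \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} \begin{pmatrix} -\frac{1}{\lambda} & 2 \\ \frac{1}{\lambda} & -1 \end{pmatrix} =: \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} P. \]

We can also go the other direction, changing from the nutritionist's basis to
the engineer's basis

\[ \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 \end{pmatrix} \begin{pmatrix} \lambda & 2\lambda \\ 1 & 1 \end{pmatrix} =: \begin{pmatrix} \vec{f}_1 & \vec{f}_2 \end{pmatrix} Q. \]

Of course, we must have

\[ Q = P^{-1}, \]

(which is in fact how we constructed $P$ in the first place).

Finally, let's consider the very first linear systems problem, where you
were given that there were 27 pieces of fruit in total and twice as many oranges
as apples. In equations this says just

\[ x + y = 27 \quad\text{and}\quad 2x - y = 0. \]

But we can also write this as a matrix system

\[ MX = V \]

where

\[ M := \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}, \qquad X := \begin{pmatrix} x \\ y \end{pmatrix}, \qquad V := \begin{pmatrix} 27 \\ 0 \end{pmatrix}. \]

Note that

\[ \vec{x} = \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} X. \]

Also let's call

\[ \vec{v} := \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} V. \]

Now the matrix $M$ is the matrix of some linear transformation $L$ in the basis
of the engineers. Let's convert it to the basis of the nutritionists:

\[ L\vec{x} = L \begin{pmatrix} \vec{f}_1 & \vec{f}_2 \end{pmatrix} \begin{pmatrix} s \\ f \end{pmatrix} = L \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} P \begin{pmatrix} s \\ f \end{pmatrix} = \begin{pmatrix} \vec{e}_1 & \vec{e}_2 \end{pmatrix} MP \begin{pmatrix} s \\ f \end{pmatrix}. \]

Note here that the linear transformation acts on vectors -- these are the
objects we have written with an arrow on top of them. It does not act on columns
of numbers!

We can easily compute $MP$ and find

\[ MP = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} -\frac{1}{\lambda} & 2 \\ \frac{1}{\lambda} & -1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -\frac{3}{\lambda} & 5 \end{pmatrix}. \]

Note that $P^{-1}MP$ is the matrix of $L$ in the nutritionist's basis, but we don't
need this quantity right now.

Thus the last task is to solve the system; let's solve for sugar and fruit.
We need to solve

\[ MP \begin{pmatrix} s \\ f \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -\frac{3}{\lambda} & 5 \end{pmatrix} \begin{pmatrix} s \\ f \end{pmatrix} = \begin{pmatrix} 27 \\ 0 \end{pmatrix}. \]

This is solved immediately by forward substitution (the nutritionist's basis
is nice since it directly gives $f$):

\[ f = 27 \quad\text{and}\quad s = 45\lambda. \]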
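
The bookkeeping in this example can be checked with exact fractions. In the sketch below, which is only an illustration, the sugar content $\lambda$ is given the hypothetical value 3 (the script leaves it symbolic), the matrices $P$, $Q$ and $M$ are written out explicitly as $P = \begin{pmatrix} -1/\lambda & 2 \\ 1/\lambda & -1 \end{pmatrix}$, $Q = \begin{pmatrix} \lambda & 2\lambda \\ 1 & 1 \end{pmatrix}$, $M = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}$, and the helper `matmul` is defined here.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

lam = Fraction(3)  # hypothetical sugar content of one apple

P = [[-1 / lam, Fraction(2)], [1 / lam, Fraction(-1)]]  # engineer -> nutritionist
Q = [[lam, 2 * lam], [Fraction(1), Fraction(1)]]        # (s, f) in terms of (x, y)
M = [[1, 1], [2, -1]]                                    # x + y = 27, 2x - y = 0

print(matmul(Q, P))  # the identity matrix, so Q is the inverse of P
MP = matmul(M, P)
print(MP)            # first row (0, 1), second row (-3/lam, 5)

# Forward substitution on MP (s, f)^T = (27, 0)^T.
f = Fraction(27) / MP[0][1]
s = -MP[1][1] * f / MP[1][0]
print(s, f)          # 135 27, i.e. s = 45 * lam

# Convert back to apples and oranges with P.
x, y = (row[0] * s + row[1] * f for row in P)
print(x, y)          # 9 18
```

Changing the assumed value of `lam` rescales the sugar total but leaves the fruit counts (9 apples and 18 oranges) unchanged.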


2 x 2 Example 


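Let's diagonalize the matrix $M$ from a previous example
(Eigenvalues and Eigenvectors: $2 \times 2$ Example)

\[ M = \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}. \]

We found the eigenvalues and eigenvectors of $M$; our solution was

\[ \lambda_1 = 5,\ v_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \quad\text{and}\quad \lambda_2 = 2,\ v_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]

So we can diagonalize this matrix using the formula $D = P^{-1}MP$ where
$P = (v_1, v_2)$. This means

\[ P = \begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix} \quad\text{and}\quad P^{-1} = -\frac{1}{3}\begin{pmatrix} -1 & -1 \\ -1 & 2 \end{pmatrix}. \]

The inverse comes from the formula for inverses of $2 \times 2$ matrices:

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \quad\text{so long as } ad - bc \neq 0. \]

So we get:

\[ D = -\frac{1}{3}\begin{pmatrix} -1 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 0 & 2 \end{pmatrix}. \]

But this doesn't really give any intuition into why this happens. Let's look
at what happens when we apply this matrix $D = P^{-1}MP$ to a vector $v = \begin{pmatrix} x \\ y \end{pmatrix}$.
Notice that applying $P$ translates $v = \begin{pmatrix} x \\ y \end{pmatrix}$ into $xv_1 + yv_2$.

\[ \begin{aligned} P^{-1}MP \begin{pmatrix} x \\ y \end{pmatrix} &= P^{-1}M \begin{pmatrix} 2x + y \\ x - y \end{pmatrix} \\ &= P^{-1}M\left[ \begin{pmatrix} 2x \\ x \end{pmatrix} + \begin{pmatrix} y \\ -y \end{pmatrix} \right] \\ &= P^{-1}\left[ (x) \cdot Mv_1 + (y) \cdot Mv_2 \right]. \end{aligned} \]

Remember that we know what $M$ does to $v_1$ and $v_2$, so we get

\[ \begin{aligned} P^{-1}\left[ (x)Mv_1 + (y)Mv_2 \right] &= P^{-1}\left[ (x\lambda_1)v_1 + (y\lambda_2)v_2 \right] \\ &= (5x)P^{-1}v_1 + (2y)P^{-1}v_2 \\ &= (5x)\begin{pmatrix} 1 \\ 0 \end{pmatrix} + (2y)\begin{pmatrix} 0 \\ 1 \end{pmatrix} \\ &= \begin{pmatrix} 5x \\ 2y \end{pmatrix}. \end{aligned} \]

Notice that multiplying by $P^{-1}$ converts $v_1$ and $v_2$ back into $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$
respectively. This shows us why $D = P^{-1}MP$ should be the diagonal matrix:

\[ D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 0 & 2 \end{pmatrix}. \]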
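
Here is a short Python sketch of the same computation, offered only as a check. The helpers `matmul` and `inverse_2x2` are written for this note; the matrices are $M = \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}$ and $P = \begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix}$, whose columns are the eigenvectors $(2, 1)$ and $(1, -1)$.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of [[a, b], [c, d]] using the formula (1 / (ad - bc)) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

M = [[4, 2], [1, 3]]
P = [[2, 1], [1, -1]]  # columns are the eigenvectors v1 and v2

P_inv = inverse_2x2(P)
D = matmul(P_inv, matmul(M, P))
print(P_inv)  # entries 1/3, 1/3, 1/3, -2/3
print(D)      # diagonal entries 5 and 2, zeros elsewhere
```

Swapping the two columns of `P` swaps the two diagonal entries of `D`, so the order of the eigenvalues on the diagonal follows the order of the eigenvectors in $P$.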
All Orthonormal Bases for $\mathbb{R}^2$


We wish to find all orthonormal bases for the space $\mathbb{R}^2$, and they are $\{e_1^\theta, e_2^\theta\}$
up to reordering where

\[ e_1^\theta = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}, \qquad e_2^\theta = \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}, \]

for some $\theta \in [0, 2\pi)$. Now first we need to show that for a fixed $\theta$ the pair
is orthogonal:

\[ e_1^\theta \cdot e_2^\theta = -\sin\theta\cos\theta + \cos\theta\sin\theta = 0. \]

Also we have

\[ \|e_1^\theta\|^2 = \|e_2^\theta\|^2 = \sin^2\theta + \cos^2\theta = 1, \]

and hence $\{e_1^\theta, e_2^\theta\}$ is an orthonormal basis. To show that every orthonormal
basis of $\mathbb{R}^2$ is $\{e_1^\theta, e_2^\theta\}$ for some $\theta$, consider an orthonormal basis $\{b_1, b_2\}$ and
note that $b_1$ forms an angle $\phi$ with the vector $e_1$ (which is $e_1^0$). Thus $b_1 = e_1^\phi$ and
if $b_2 = e_2^\phi$, we are done, otherwise $b_2 = -e_2^\phi$ and it is the reflected version.
However we can do the same thing except starting with $b_2$ and get $b_2 = e_1^\psi$ and
$b_1 = e_2^\psi$ since we have just interchanged two basis vectors which corresponds to
a reflection which picks up a minus sign as in the determinant.


A 4 x 4 Gram Schmidt Example


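Let's do an example of how to "Gram-Schmidt" some vectors in $\mathbb{R}^4$. Given the
following vectors

\[ v_1 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix},\quad v_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix},\quad v_3 = \begin{pmatrix} 3 \\ 0 \\ 1 \\ 0 \end{pmatrix},\quad\text{and}\quad v_4 = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 2 \end{pmatrix}, \]

we start with $v_1$

\[ v_1^\perp = v_1 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}. \]

Now the work begins

\[ \begin{aligned} v_2^\perp &= v_2 - \frac{(v_1^\perp \cdot v_2)}{\|v_1^\perp\|^2} v_1^\perp \\ &= \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix} - \frac{1}{1}\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \\ &= \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \end{aligned} \]

This gets a little longer with every step.

\[ \begin{aligned} v_3^\perp &= v_3 - \frac{(v_1^\perp \cdot v_3)}{\|v_1^\perp\|^2} v_1^\perp - \frac{(v_2^\perp \cdot v_3)}{\|v_2^\perp\|^2} v_2^\perp \\ &= \begin{pmatrix} 3 \\ 0 \\ 1 \\ 0 \end{pmatrix} - \frac{0}{1}\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} - \frac{1}{1}\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \end{aligned} \]

This last step requires subtracting off the term of the form $\frac{u \cdot v}{u \cdot u} u$ for each of
the previously defined basis vectors.

\[ \begin{aligned} v_4^\perp &= v_4 - \frac{(v_1^\perp \cdot v_4)}{\|v_1^\perp\|^2} v_1^\perp - \frac{(v_2^\perp \cdot v_4)}{\|v_2^\perp\|^2} v_2^\perp - \frac{(v_3^\perp \cdot v_4)}{\|v_3^\perp\|^2} v_3^\perp \\ &= \begin{pmatrix} 1 \\ 1 \\ 0 \\ 2 \end{pmatrix} - \frac{1}{1}\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} - \frac{0}{1}\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} - \frac{3}{9}\begin{pmatrix} 3 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 2 \end{pmatrix}. \end{aligned} \]

Now $v_1^\perp$, $v_2^\perp$, $v_3^\perp$, and $v_4^\perp$ are an orthogonal basis. Notice that even with very,
very nice looking vectors we end up having to do quite a bit of arithmetic.
This is a good reason to use programs like matlab to check your work.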
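
In the spirit of checking your work with a program, here is a minimal Gram-Schmidt sketch in Python (chosen here instead of matlab purely for illustration; the function names are made up). It applies the projection formula to the vectors $v_1 = (0,1,0,0)$, $v_2 = (0,1,1,0)$, $v_3 = (3,0,1,0)$, $v_4 = (1,1,0,2)$ with exact fractions, and does not normalize the result.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthogonalize linearly independent vectors (without normalizing them)."""
    orthogonal = []
    for v in vectors:
        w = [Fraction(x) for x in v]
        for b in orthogonal:
            # Subtract the component of v along each earlier orthogonal vector.
            coefficient = dot(b, v) / dot(b, b)
            w = [wi - coefficient * bi for wi, bi in zip(w, b)]
        orthogonal.append(w)
    return orthogonal

vectors = [(0, 1, 0, 0), (0, 1, 1, 0), (3, 0, 1, 0), (1, 1, 0, 2)]
result = gram_schmidt(vectors)
for w in result:
    print([int(x) for x in w])

# Every pair of distinct output vectors should have dot product zero.
print(all(dot(result[i], result[j]) == 0
          for i in range(4) for j in range(i + 1, 4)))
```

Dividing each output vector by its length would turn this orthogonal basis into an orthonormal one.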


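Another QR Decomposition Example


We can alternatively think of the QR decomposition as performing the Gram-
Schmidt procedure on the column space (the vector space spanned by the column
vectors) of the matrix $M$. The resulting orthonormal basis will be
stored in $Q$ and the negative of the coefficients will be recorded in $R$. Note
that $R$ is upper triangular by how Gram-Schmidt works. Here we will explicitly
do an example with the matrix

\[ M = \begin{pmatrix} | & | & | \\ m_1 & m_2 & m_3 \\ | & | & | \end{pmatrix} = \begin{pmatrix} 1 & 1 & -1 \\ 0 & 1 & 2 \\ -1 & 1 & 1 \end{pmatrix}. \]

First we normalize $m_1$ to get $m_1' = \frac{m_1}{\|m_1\|}$ where $\|m_1\| = r^1_1 = \sqrt{2}$ which gives the
decomposition

\[ Q_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} & 1 & -1 \\ 0 & 1 & 2 \\ -\frac{1}{\sqrt{2}} & 1 & 1 \end{pmatrix}, \qquad R_1 = \begin{pmatrix} \sqrt{2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Next we find

\[ t_2 = m_2 - (m_1' \cdot m_2) m_1' = m_2 - r^1_2 m_1' = m_2 - 0\, m_1' \]

noting that

\[ m_1' \cdot m_1' = \|m_1'\|^2 = 1 \]

and $\|t_2\| = r^2_2 = \sqrt{3}$, and so we get $m_2' = \frac{t_2}{\|t_2\|}$ with the decomposition

\[ Q_2 = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} & -1 \\ 0 & \frac{1}{\sqrt{3}} & 2 \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} & 1 \end{pmatrix}, \qquad R_2 = \begin{pmatrix} \sqrt{2} & 0 & 0 \\ 0 & \sqrt{3} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Finally we calculate

\[ \begin{aligned} t_3 &= m_3 - (m_1' \cdot m_3) m_1' - (m_2' \cdot m_3) m_2' \\ &= m_3 - r^1_3 m_1' - r^2_3 m_2' = m_3 + \sqrt{2}\, m_1' - \frac{2}{\sqrt{3}} m_2', \end{aligned} \]

again noting $m_2' \cdot m_2' = \|m_2'\|^2 = 1$, and let $m_3' = \frac{t_3}{\|t_3\|}$ where $\|t_3\| = r^3_3 = 2\sqrt{\frac{2}{3}}$. Thus
we get our final $M = QR$ decomposition as

\[ Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{6}} \\ 0 & \frac{1}{\sqrt{3}} & \frac{2}{\sqrt{6}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{6}} \end{pmatrix}, \qquad R = \begin{pmatrix} \sqrt{2} & 0 & -\sqrt{2} \\ 0 & \sqrt{3} & \frac{2}{\sqrt{3}} \\ 0 & 0 & 2\sqrt{\frac{2}{3}} \end{pmatrix}. \]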
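
The decomposition can be double-checked numerically. The following sketch is an illustration only (the function `qr_gram_schmidt` is written for this note and assumes the columns are linearly independent); it runs Gram-Schmidt on the columns $(1, 0, -1)$, $(1, 1, 1)$, $(-1, 2, 1)$ of $M$ using floating point numbers, so results should be compared up to rounding.

```python
import math

def qr_gram_schmidt(columns):
    """QR via Gram-Schmidt; returns the orthonormal columns of Q and the matrix R.

    Assumes the given columns are linearly independent.
    """
    n = len(columns)
    q_columns = []
    R = [[0.0] * n for _ in range(n)]
    for j, m in enumerate(columns):
        t = [float(x) for x in m]
        for i, q in enumerate(q_columns):
            R[i][j] = sum(a * b for a, b in zip(q, m))  # r^i_j = m_i' . m_j
            t = [ti - R[i][j] * qi for ti, qi in zip(t, q)]
        R[j][j] = math.sqrt(sum(x * x for x in t))      # r^j_j = ||t_j||
        q_columns.append([x / R[j][j] for x in t])
    return q_columns, R

columns = [(1, 0, -1), (1, 1, 1), (-1, 2, 1)]  # m_1, m_2, m_3
q_columns, R = qr_gram_schmidt(columns)

for row in R:
    print([round(x, 4) for x in row])

# Rebuild each column of M as a combination of the columns of Q.
for j, m in enumerate(columns):
    rebuilt = [sum(R[i][j] * q_columns[i][k] for i in range(3)) for k in range(3)]
    print(m, [round(x, 10) for x in rebuilt])
```

The diagonal of `R` should come out close to $\sqrt{2}$, $\sqrt{3}$ and $2\sqrt{2/3}$, and the rebuilt columns should agree with $m_1$, $m_2$, $m_3$.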


Overview 


This video depicts the ideas of a subspace sum, a direct sum and an orthogonal
complement in $\mathbb{R}^3$. Firstly, let's start with the subspace sum. Remember that
even if $U$ and $V$ are subspaces, their union $U \cup V$ is usually not a subspace.
However, the span of their union certainly is and is called the subspace sum

\[ U + V = \operatorname{span}(U \cup V). \]

You need to be aware that this is a sum of vector spaces (not vectors). A
picture of this is a pair of planes $U$ and $V$ in $\mathbb{R}^3$.

Here $U + V = \mathbb{R}^3$.

Next let's consider a direct sum. This is just the subspace sum for the
case when $U \cap V = \{0\}$. For that we can keep the plane $U$ but must replace $V$ by
a line.

Taking a direct sum we again get the whole space, $U \oplus V = \mathbb{R}^3$.

Now we come to an orthogonal complement. There is not really a notion of
subtraction for subspaces but the orthogonal complement comes close. Given $U$
it provides a space $U^\perp$ such that the direct sum returns the whole space:

\[ U \oplus U^\perp = \mathbb{R}^3. \]

The orthogonal complement $U^\perp$ is the subspace made from all vectors
perpendicular to every vector in $U$. Here, we need to just tilt the line $V$ above until
it hits $U$ at a right angle.

Notice, we can apply the same operation to $U^\perp$ and just get $U$ back again, i.e.

\[ (U^\perp)^\perp = U. \]


Hint for Review Question 2 


You are asked to consider an orthogonal basis $\{v_1, v_2, \ldots, v_n\}$. Because this is a
basis any $v \in V$ can be uniquely expressed as

\[ v = c^1 v_1 + c^2 v_2 + \cdots + c^n v_n, \]

and the number $n = \dim V$. Since this is an orthogonal basis

\[ v_i \cdot v_j = 0, \qquad i \neq j. \]

So different vectors in the basis are orthogonal.
However, the basis is not orthonormal so we know nothing about the lengths of
the basis vectors (save that they cannot vanish).

To complete the hint, let's use the dot product to compute a formula for $c^1$
in terms of the basis vectors and $v$. Consider

\[ v_1 \cdot v = c^1 v_1 \cdot v_1 + c^2 v_1 \cdot v_2 + \cdots + c^n v_1 \cdot v_n = c^1 v_1 \cdot v_1. \]

Solving for $c^1$ (remembering that $v_1 \cdot v_1 \neq 0$) gives

\[ c^1 = \frac{v_1 \cdot v}{v_1 \cdot v_1}. \]

This should get you started on this problem.


Hint for Review Problem 3 


Let’s work part by part:


(a) Is the vector $v^\perp = v - \frac{u \cdot v}{u \cdot u}\, u$ in the plane $P$?


Remember that the dot product gives you a scalar, not a vector. So in this
formula $\frac{u \cdot v}{u \cdot u}$ is a scalar, and $v^\perp$ is a linear combination
of $v$ and $u$. Do you think it is in the span?


(b) What is the angle between $v^\perp$ and $u$?


This part will make more sense if you think back to the dot product
formulas you probably first saw in multivariable calculus. Remember that


$$u \cdot v = \|u\| \, \|v\| \cos(\theta),$$


and in particular if they are perpendicular, then $\theta = \frac{\pi}{2}$ and $\cos(\frac{\pi}{2}) = 0$, so you will
get $u \cdot v = 0$.
Now try to compute the dot product of $u$ and $v^\perp$ to find $\|u\| \, \|v^\perp\| \cos(\theta)$:


$$u \cdot v^\perp = u \cdot \left( v - \frac{u \cdot v}{u \cdot u}\, u \right) = u \cdot v - \frac{u \cdot v}{u \cdot u}\, u \cdot u.$$


Now you finish simplifying and see if you can figure out what $\theta$ has to be.


(c) Given your solution to the above, how can you find a third vector
perpendicular to both $u$ and $v^\perp$?


Remember what other things you learned in multivariable calculus? This
might be a good time to remind yourself what the cross product does.


(d) Construct an orthonormal basis for $\mathbb{R}^3$ from $u$ and $v$.


If you did part (c) you can probably find 3 orthogonal vectors to make
an orthogonal basis. All you need to do to turn this into an orthonormal
basis is make these into unit vectors.


(e) Test your abstract formulae starting with


$$u = \begin{pmatrix} 1 & 2 & 0 \end{pmatrix} \quad \text{and} \quad v = \begin{pmatrix} 0 & 1 & 1 \end{pmatrix}.$$


Try it out, and if you get stuck try drawing a sketch of the vectors you
have.
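

Once you have tried part (e) by hand, you might check your numbers against a short Python sketch like the one below. It follows the steps of parts (a), (c) and (d) for the vectors given in part (e). The helper names `dot`, `cross` and `unit` are assumptions introduced here, and this is only one possible way to organize the computation.

```python
from fractions import Fraction
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    # Rescale a to length 1 (floating point, since lengths involve square roots).
    length = math.sqrt(dot(a, a))
    return tuple(float(x) / length for x in a)

u = (Fraction(1), Fraction(2), Fraction(0))
v = (Fraction(0), Fraction(1), Fraction(1))

# Part (a): remove from v its component along u.
scale = dot(u, v) / dot(u, u)
v_perp = tuple(vi - scale * ui for ui, vi in zip(u, v))

# Part (c): the cross product is perpendicular to both u and v_perp.
w = cross(u, v_perp)

# Part (d): rescale all three vectors to unit length.
orthonormal_basis = [unit(x) for x in (u, v_perp, w)]

print("u . v_perp =", dot(u, v_perp))
print("w =", w)
for e in orthonormal_basis:
    print(e)
```

The first two steps use exact fractions, so the dot product of `u` and `v_perp` comes out as exactly zero rather than a tiny rounding error.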


Hint for Review Problem 9 


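This video shows you a way to solve problem 9 that’s different to the method
described in the Lecture. The first thing is to think of


$$M = \begin{pmatrix} 1 & 0 & 2 \\ -1 & 2 & 0 \\ -1 & -2 & 2 \end{pmatrix}$$

as a set of 3 vectors

$$v_1 = \begin{pmatrix} 1 \\ -1 \\ -1 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 0 \\ 2 \\ -2 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 2 \\ 0 \\ 2 \end{pmatrix}.$$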


Then you need to remember that we are searching for a decomposition

$$M = QR,$$


where $Q$ is an orthogonal matrix. Thus $Q^T Q = I$ and the upper triangular matrix is $R = Q^T M$.
Moreover, orthogonal matrices perform rotations (possibly combined with a reflection). To see this,
compare the inner product $u \cdot v = u^T v$ of vectors $u$ and $v$ with that of $Qu$ and
$Qv$:

$$(Qu) \cdot (Qv) = (Qu)^T (Qv) = u^T Q^T Q v = u^T v = u \cdot v.$$


Since the dot product doesn’t change, we learn that $Q$ does not change angles
or lengths of vectors.
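

As a quick sanity check of this invariance, here is a small Python sketch. It uses a rotation about the $z$-axis as the orthogonal matrix. The angle, the two test vectors, and the helper names `dot` and `matvec` are arbitrary choices made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(Q, x):
    return [sum(q * xi for q, xi in zip(row, x)) for row in Q]

theta = 0.7  # an arbitrary rotation angle, in radians
Q = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta),  math.cos(theta), 0.0],
     [0.0,              0.0,             1.0]]

u = [1.0, 2.0, 0.0]
v = [0.0, 1.0, 1.0]

before = dot(u, v)
after = dot(matvec(Q, u), matvec(Q, v))
print(before, after)  # the two values agree up to rounding error
```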

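Now, here’s an interesting procedure: rotate $v_1$, $v_2$ and $v_3$ such that $v_1$ is
along the $x$-axis and $v_2$ is in the $xy$-plane. Then if you put these in a matrix you
get something of the form


$$\begin{pmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{pmatrix},$$


which is exactly what we want for $R$!
Moreover, the vector


$$\begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix}$$


is the rotated $v_1$ so must have length $\|v_1\| = \sqrt{3}$. Thus $a = \sqrt{3}$.
The rotated $v_2$ is


$$\begin{pmatrix} b \\ d \\ 0 \end{pmatrix}$$


and must have length $\|v_2\| = 2\sqrt{2}$. Also the dot product between


$$\begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} b \\ d \\ 0 \end{pmatrix}$$

is $ab$ and must equal $v_1 \cdot v_2 = 0$. (That $v_1$ and $v_2$ were orthogonal is just a
coincidence here....) Thus $b = 0$, and the length condition then gives $d = 2\sqrt{2}$
(taking the positive root). So now we know most of the matrix $R$:

$$R = \begin{pmatrix} \sqrt{3} & 0 & c \\ 0 & 2\sqrt{2} & e \\ 0 & 0 & f \end{pmatrix}.$$


You can work out the last column using the same ideas. Thus it only remains to
compute $Q$ from


$$Q = M R^{-1}.$$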
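

If you want to check your last column numerically, one option is the Python sketch below. Be aware that it uses the Gram-Schmidt procedure rather than the rotation argument of this video; with positive diagonal entries both approaches should lead to the same $R$. The function name `qr_gram_schmidt` is a hypothetical helper written for this illustration, and the vectors are the columns $v_1$, $v_2$, $v_3$ of $M$, with $\|v_1\| = \sqrt{3}$, $\|v_2\| = 2\sqrt{2}$ and $v_1 \cdot v_2 = 0$ as stated above.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def qr_gram_schmidt(columns):
    # columns: linearly independent column vectors v_1, v_2, ...
    # Returns the orthonormal columns q_i of Q and the upper triangular R,
    # where R[i][j] = q_i . v_j for i < j and R[j][j] is the leftover length.
    n = len(columns)
    qs = []
    R = [[0.0] * n for _ in range(n)]
    for j, v in enumerate(columns):
        w = [float(x) for x in v]
        for i, q in enumerate(qs):
            R[i][j] = dot(q, v)
            w = [wk - R[i][j] * qk for wk, qk in zip(w, q)]
        R[j][j] = math.sqrt(dot(w, w))
        qs.append([wk / R[j][j] for wk in w])
    return qs, R

v1, v2, v3 = (1, -1, -1), (0, 2, -2), (2, 0, 2)
q_columns, R = qr_gram_schmidt([v1, v2, v3])

for row in R:
    print([round(x, 4) for x in row])
```

The printed first two columns should match $a = \sqrt{3}$, $b = 0$ and $d = 2\sqrt{2}$ found above, and the third column gives values to compare with your own $c$, $e$ and $f$.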


G.14 Diagonalizing Symmetric Matrices 


3 x 3 Example 


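Let’s diagonalize the matrix

$$M = \begin{pmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix}.$$


If we want to diagonalize this matrix, we should be happy to see that it
is symmetric, since this means we will have real eigenvalues, which means
factoring won’t be too hard. As an added bonus, if we have three distinct
eigenvalues the eigenvectors we find will automatically be orthogonal, which
means that the inverse of the matrix $P$ will be easy to compute. We can start
by finding the eigenvalues of this matrix:


$$
\begin{aligned}
\det \begin{pmatrix} 1-\lambda & 2 & 0 \\ 2 & 1-\lambda & 0 \\ 0 & 0 & 5-\lambda \end{pmatrix}
&= (1-\lambda) \begin{vmatrix} 1-\lambda & 0 \\ 0 & 5-\lambda \end{vmatrix}
 - 2 \begin{vmatrix} 2 & 0 \\ 0 & 5-\lambda \end{vmatrix}
 + 0 \begin{vmatrix} 2 & 1-\lambda \\ 0 & 0 \end{vmatrix} \\
&= (1-\lambda)(1-\lambda)(5-\lambda) + (-2)(2)(5-\lambda) + 0 \\
&= (1 - 2\lambda + \lambda^2)(5-\lambda) + (-2)(2)(5-\lambda) \\
&= ((1-4) - 2\lambda + \lambda^2)(5-\lambda) \\
&= (-3 - 2\lambda + \lambda^2)(5-\lambda) \\
&= -(1+\lambda)(3-\lambda)(5-\lambda).
\end{aligned}
$$
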
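So we get $\lambda = -1, 3, 5$ as eigenvalues. First find $v_1$ for $\lambda_1 = -1$:

$$(M + I) \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 2 & 2 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 6 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

implies that $2x + 2y = 0$ and $6z = 0$, which means any multiple of $v_1 = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}$ is
an eigenvector with eigenvalue $\lambda_1 = -1$. Now for $v_2$ with $\lambda_2 = 3$:

$$(M - 3I) \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 & 2 & 0 \\ 2 & -2 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},$$

and we can find that $v_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$ satisfies $-2x + 2y = 0$, $2x - 2y = 0$ and
$2z = 0$.
Now for $v_3$ with $\lambda_3 = 5$:

$$(M - 5I) \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -4 & 2 & 0 \\ 2 & -4 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

Now we want $v_3$ to satisfy $-4x + 2y = 0$ and $2x - 4y = 0$, which imply $x = y = 0$,
but since there are no restrictions on the $z$ coordinate we have $v_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$.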


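Notice that the eigenvectors form an orthogonal basis. We can create an
orthonormal basis by rescaling to make them unit vectors. This will help us
because if $P = [v_1, v_2, v_3]$ is created from orthonormal vectors then $P^{-1} = P^T$,
which means computing $P^{-1}$ should be easy. So let’s say


$$v_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{pmatrix}, \quad v_2 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{pmatrix}, \quad \text{and} \quad v_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

so we get

$$P = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad P^{-1} = \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

So when we compute $D = P^{-1} M P$ we’ll get

$$\begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix}.$$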
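

It is worth multiplying the three matrices out by hand once, but you can also confirm the result with a few lines of Python. The sketch below builds $P$ from the unit eigenvectors $\frac{1}{\sqrt{2}}(1, -1, 0)$, $\frac{1}{\sqrt{2}}(1, 1, 0)$ and $(0, 0, 1)$ and uses $P^{-1} = P^T$. The helper names `matmul` and `transpose` are assumptions introduced for this illustration.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

s = 1 / math.sqrt(2)
M = [[1, 2, 0],
     [2, 1, 0],
     [0, 0, 5]]
# Columns of P are the unit eigenvectors for the eigenvalues -1, 3 and 5.
P = [[ s, s, 0],
     [-s, s, 0],
     [ 0, 0, 1]]

# Because the columns of P are orthonormal, its inverse is its transpose.
D = matmul(transpose(P), matmul(M, P))

for row in D:
    print([round(x, 10) for x in row])
```

Up to rounding, the output is the diagonal matrix with the eigenvalues on the diagonal, in the same order as the eigenvectors appear in the columns of $P$.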


Hint for Review Problem 1 
For part (a), we can consider any complex number $z$ as being a vector in $\mathbb{R}^2$ where
complex conjugation corresponds to the matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. Can you describe $z\bar{z}$
in terms of $\|z\|$? For part (b), think about what values $a \in \mathbb{R}$ can take if
$a = -a$. For part (c), just compute it and look back at part (a).

For part (d), note that $x^\dagger x$ is just a number, so we can divide by it.
Parts (e) and (f) follow right from definitions. For part (g), first notice
that every row vector is the (unique) transpose of a column vector, and also
think about why $(AA^T)^T = AA^T$ for any matrix $A$. Additionally you should see
that $\overline{x}^T = x^\dagger$ and mention this. Finally for part (h), show that


$$\frac{x^\dagger M x}{x^\dagger x} = \frac{\overline{(x^\dagger M x)^T}}{x^\dagger x}$$


and reduce each side separately to get $\lambda = \bar{\lambda}$.


G.15 Kernel, Range, Nullity, Rank 


Invertibility Conditions 


Here I am going to discuss some of the conditions on the invertibility of a
matrix stated in Theorem 16.1.1. Condition 1 states that $X = M^{-1}V$ uniquely,
which is clearly equivalent to 4. Similarly, every square matrix $M$ uniquely
corresponds to a linear transformation $L \colon \mathbb{R}^n \to \mathbb{R}^n$, so condition 3 is
equivalent to condition 1.

Condition 6 implies 4 because the adjoint can be used to construct the inverse,
but the converse is not so obvious. For the converse (4 implying 6), we refer
back to the proofs in Chapters 18 and 19. Note that if $\det M = 0$, there exists an
eigenvalue of $M$ equal to 0, which implies $M$ is not invertible. Thus condition 8
is equivalent to conditions 4, 5, 9, and 10.

By definition, the map $M$ is injective if its null space is trivial;
moreover, the eigenvectors with eigenvalue 0 span the null space. Hence
conditions 8 and 14 are equivalent, and 14, 15, and 16 are equivalent by the
Dimension Formula (also known as the Rank-Nullity Theorem).

Now conditions 11, 12, and 13 are all equivalent by the definition of a
basis. Finally, if a matrix $M$ is not row-equivalent to the identity matrix,
then $\det M = 0$, so conditions 2 and 8 are equivalent.


Hint for Review Problem 2 


Let’s work through this problem.


Let $L \colon V \to W$ be a linear transformation. Show that $\ker L = \{0_V\}$ if and
only if $L$ is one-to-one:


1. First, suppose that $\ker L = \{0_V\}$. Show that $L$ is one-to-one.


Remember what one-one means: whenever $L(x) = L(y)$ we can be
certain that $x = y$. While this might seem like a weird thing to require,
this statement really means that each vector in the range is the image of
a unique vector in the domain.


We want to establish the one-one property, but we also don’t want to forget
some of the more basic properties of linear transformations, namely that
they are linear, which means $L(ax + by) = aL(x) + bL(y)$ for scalars $a$ and
$b$.


What if we rephrase the one-one property to say that $L(x) - L(y) = 0$
implies $x - y = 0$? Can we connect that to the statement that
$\ker L = \{0_V\}$? Remember that if $L(v) = 0$ then $v \in \ker L = \{0_V\}$.


2. Now, suppose that $L$ is one-to-one. Show that $\ker L = \{0_V\}$. That is, show
that $0_V$ is in $\ker L$, and then show that there are no other vectors in
$\ker L$.


What would happen if we had a nonzero kernel? If we had some vector $v$
with $L(v) = 0$ and $v \neq 0$, we could try to show that this would contradict
the given that $L$ is one-one. If we found $x$ and $y$ with $L(x) = L(y)$, then
we know $x = y$. But if $L(v) = 0$ then $L(x) + L(v) = L(y)$. Does this cause a
problem?
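

To get a feeling for why a nonzero kernel is a problem, it can help to look at a concrete map. In the Python sketch below the matrix `M`, the kernel vector `v` and the vector `x` are made-up numbers chosen for illustration, and `apply` is a hypothetical helper that computes $L(x) = Mx$.

```python
def apply(M, x):
    # Compute the matrix-vector product M x, i.e. L(x).
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# A linear map L: R^3 -> R^2 with a nonzero kernel.
M = [[1, 0, 1],
     [0, 1, 1]]

v = [1, 1, -1]   # a nonzero vector with L(v) = 0
x = [2, 3, 5]
y = [xi + vi for xi, vi in zip(x, v)]   # y = x + v, so y is different from x

print(apply(M, v))            # the zero vector
print(apply(M, x), apply(M, y))
```

Here `x` and `y` are different vectors with the same image, so this particular `L` cannot be one-to-one. Your proof needs to explain why this always happens when the kernel contains a nonzero vector.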
G.16 Least Squares and Singular Values


Least Squares: Hint for Review Problem 1 


Let’s work through this problem. Let $L \colon U \to V$ be a linear transformation.
Suppose $v \in L(U)$ and you have found a vector $u_{ps}$ that obeys $L(u_{ps}) = v$.

Explain why you need to compute $\ker L$ to describe the solution space of the
linear system $L(u) = v$.

Remember the property of linearity that comes along with any linear
transformation: $L(ax + by) = aL(x) + bL(y)$ for scalars $a$ and $b$. This allows us to
break apart and recombine terms inside the transformation.

Now suppose we have a solution $x$ where $L(x) = v$. If we have a vector
$y \in \ker L$ then we know $L(y) = 0$. If we add the equations together,
$L(x) + L(y) = L(x + y) = v + 0$, we get another solution for free. Now we have two
solutions; is that all?
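

Here is a small Python sketch of the "another solution for free" idea. The matrix `M`, the right-hand side `v`, the particular solution `u_ps` and the kernel vector `k` are all invented for this illustration, and `apply` is a hypothetical helper for the matrix-vector product.

```python
def apply(M, x):
    # Compute the matrix-vector product M x, i.e. L(x).
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

M = [[1, 1, 0],
     [0, 1, 1]]
v = [3, 5]

u_ps = [3, 0, 5]    # one particular solution: L(u_ps) = v
k = [1, -1, 1]      # a vector in the kernel: L(k) = 0

# Adding any multiple of a kernel vector to u_ps gives another solution.
solutions = [[p + t * ki for p, ki in zip(u_ps, k)] for t in range(-2, 3)]

for u in solutions:
    print(u, apply(M, u))
```

Every vector printed is sent to the same `v`. The question in the hint is whether solutions of this form, with the kernel vector ranging over all of $\ker L$, are the only ones.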


Hint for Review Problem 2 


For the first part, what is the transpose of a $1 \times 1$ matrix? For the other two
parts, note that $v \cdot v = v^T v$. Can you express this in terms of $\|v\|$? Also, you
need the trivial kernel only for the last part; just think about the null
space of $M$. It might help to substitute $w = Mx$.